Pair Programming with Microsoft's Damian Edwards - Retrieving and parsing JSON with .NET 6

Captions
hey friends i'm scott hanselman and i'm here with damian edwards and we're going to try something because we did some live coding with our friend john a couple of days ago where we did some work in go and we learned how to use visual studio code with live share and docker so i thought that if you all like these kinds of things we'd solve some real problems so damian here is a real problem that i have i'm going to share my screen here and i'm going to say azurefriday.com this is a website that i built with my buddy in nigeria and we went and made azurefriday.com which currently goes to an azure website when i get it working perfectly i'll make the domain go to the right place all right okay so my buddy oluwa karede and i made this website and this would go and take a feed from channel nine and then load it up into a single json file hold it in memory and we had this really cool javascript client-side thing where i could start typing and it would immediately search and it was super fast because there's only 700 episodes and 700 you know it's like a meg it's nothing right and here's what the url used to look like okay you would go to channel nine dot msdn slash api i had a guid and then i worked with the channel nine team and they added a thing called format equals taco because they had paging and i'm not a fan of paging apis i have 700 episodes of azure friday i don't want to get it 30 at a time okay here's where things go south if we click on that right now it redirects because channel 9 is joining microsoft learn and over the next several months they're going to be you know fixing a lot of these redirects but the point is in the short term it broke my little you know fun website here so i called our friends over there and asked them for some advice and here's a piece of the email here so let's kind of parse this a bit you can use the docs hierarchy service to fetch
a list of episodes for the show so the show still exists you have an api whatever slash episodes but it has paging okay so page zero two whatever we'll learn about that the service has a page size limit of 30 then they said yeah you can diagnose distance and cross origin whatever let's take a look at this url here and see what we're dealing with so we drop that in a json view so it's a little bit prettier here we'll zoom in a little bit more and it looks like you've got an entry id a title a url presumably the base of this url is docs.microsoft.com so we can test that by going docs.microsoft.com that in fact works so now that's going to dictate our link what i don't see is any mp4 files here i don't see any iframes or anything like that okay yep i don't see a thumbnail which i would love to have there's no media at all no media so it says to get the media you take the entry id and you fan out so here's some media you know this is like a document right this is a document database so this is all the information about the document and here we go look at that captions thumbnails all the stuff that we need which is great but now i have to go and dig those things out right so what was a simple get call to a rest api now turns into a kind of cartesian product explosion of rest calls cascading calls cascading calls indeed now that's a really common problem i think people have these issues happen this is just part of life now they do offer a batch thing where i could say here's one id and another id okay but i have a tendency yeah i have a tendency to abuse apis and you know of course the obvious next step is oh i'll just put a url with 700 entry ids concatenated at the end that'll be fine right but no doubt that'll break something and it makes someone angry right but it does work for two okay and it does give you the metadata still and it does give me the metadata it's literally a batch api
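The point above about each episode's url being relative to docs.microsoft.com can be sketched in a few lines of C#. This is a minimal illustration, not the code written in the session: the helper name `ToAbsoluteUrl` and the sample path are made up for the example, and only the host name comes from the conversation.

```csharp
using System;

// The host comes from the conversation; everything else here is illustrative.
var docsBase = new Uri("https://docs.microsoft.com");

// Combine the base address with a site-relative path.
// The Uri constructor takes care of the joining slash.
Uri ToAbsoluteUrl(string relativeUrl) => new Uri(docsBase, relativeUrl);

// Hypothetical episode path, for illustration only.
Console.WriteLine(ToAbsoluteUrl("/shows/azure-friday/example-episode"));
```

Keeping the base address separate from the relative path also makes it easy to point the same code at a different host later.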
okay okay so that's good information so if we go back here because one of the things is you don't just start coding you got to do the analysis right we look at the first page it looks like we've got 3 4 5 6 7 8 9 10 11 12 13 14 what do we got here a ton maybe we're getting 30 no not that many actually you could stick it in vs code and search for uid and the number of instances of your uid would give you the episode count is that what you're looking for yeah so this is one of those things where it's like do you go to postman do you go to nightingale do you go to curl like what i want is the length of the episodes array so i could go in here yeah i could go and drop in that url i could hit go it says page size 30 now this says episode count 351 there's supposed to be 700 episodes i'll deal with that later okay what i want is that length how did you think i could get it i would take the content and put it in vs code and then do a string search on uid because there's one instance of that per episode oh you know the property name there okay so let's do this and let's make a new folder a new file rather drop it in and i'd search for the you know how many uid quote just to manipulate 32 the page size is 30.
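As an alternative to searching the raw text for uid, the length of the episodes array can be read directly with System.Text.Json. This is a minimal sketch: the property names `episodes`, `uid` and `pageSize` are assumptions based on what is said on screen, and the sample JSON is invented.

```csharp
using System;
using System.Text.Json;

// Assumption: the response is a JSON object with an "episodes" array.
int CountEpisodes(string json)
{
    using var doc = JsonDocument.Parse(json);
    return doc.RootElement.GetProperty("episodes").GetArrayLength();
}

// Invented sample with three episodes.
var sampleJson = "{\"pageSize\":30,\"episodes\":[{\"uid\":\"a\"},{\"uid\":\"b\"},{\"uid\":\"c\"}]}";
Console.WriteLine(CountEpisodes(sampleJson)); // 3
```

Counting array elements avoids stray hits from other text that happens to contain the search string, which is what the 32 versus 30 mismatch above looks like.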
so that makes sense then you might be getting a couple of other hits oh no it's 30 with quotes yeah there you go so in their context page size is a maximum of 30 episodes so we're going to have to call that you know 700 divided by 30 times to go and figure that out do you know why the total's wrong or are you just not worried about it right now it's more likely that they have picked a date an arbitrary date like 2009 or something and just said we're not going to show any of these old ones it could be that they retired things that were old or used the old version of the portal or whatever you know half of the episodes are there half are not could be any number of reasons but i'm assuming that as new episodes come in that'll change okay what we need to do then is grab to start one page if i go and say page one there we go so here's the next thirty page one page two and i just keep the page size so it's nice i can just put a page number here did you get docs for the max page size or did you already figure that out no this was given as an example in an email okay so can you just like change it to 50 and see what happens that's a good question so let's go and say page zero page size 700 i was told in the docs that it is yeah page size exceeds maximum of 30.
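The paging arithmetic and the "just put a page number here" idea can be written down as a small sketch. The endpoint below is a placeholder, since the real URL is not spelled out in the conversation, and the query parameter names `page` and `pageSize` are assumptions.

```csharp
using System;
using System.Globalization;

// Hypothetical endpoint and parameter names; only the limit of 30 comes from the API's error message.
const string EpisodesUrlTemplate = "https://example.invalid/api/episodes?page={0}&pageSize={1}";
const int MaxPageSize = 30;

// Fill the page number and page size into the template.
string BuildPageUrl(int page, int pageSize) =>
    string.Format(CultureInfo.InvariantCulture, EpisodesUrlTemplate, page, pageSize);

// Ceiling division: number of pages needed to cover totalEpisodes.
int PagesNeeded(int totalEpisodes, int pageSize) =>
    (totalEpisodes + pageSize - 1) / pageSize;

Console.WriteLine(BuildPageUrl(1, MaxPageSize));
Console.WriteLine(PagesNeeded(351, MaxPageSize)); // 12
Console.WriteLine(PagesNeeded(700, MaxPageSize)); // 24
```

With the 351 episodes the API reported, that works out to 12 pages; with the expected 700 it would be 24.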
it is a good error that's very nice can you uh the next thing i would try is go to your back have you got the batch ones before we do that i'm going to pick a page way outside okay so we'll get an empty array if we go outside so we know when to stop yep we go until we hit an empty array oh that's interesting go back to okay so this one go to the bottom just collapse episodes sorry okay okay so they're not doing what you sometimes see in apis like hypermedia style apis yep a proper rest thing yeah okay so they're not doing that okay so we've come up with the mechanism which is a great point and then let's just make sure that we hit that again for folks that are listening in in a classical rest api you'd have a next link and previous link at the bottom with a url and you'd simply follow it you'd teach your thing to go next next next until there was no more next maybe i'm not sure if i would call that classical but i'm sorry that was in the original the hypermedia style api mm-hmm where you're expected to navigate around the api space effectively you know based on urls or uri but yeah okay so we go and do n number of those so then the question is i'm thinking if i get 30 per yeah if i get 30 per can i take the batch right example here which is using two and let's make an example of that times 15.
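The stopping rule worked out above, keep requesting pages until the episodes array comes back empty, might look like the sketch below. The fetch function is injected so the loop can be tried without calling the real service; the `entryId` property name, the `maxPages` safety cap and the fake data are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading.Tasks;

// Walk pages 0, 1, 2, ... until the API returns an empty "episodes" array.
// maxPages is a safety cap so a misbehaving API cannot loop forever.
async Task<List<string>> CollectEntryIdsAsync(Func<int, Task<string>> fetchPageJson, int maxPages = 100)
{
    var entryIds = new List<string>();
    for (var page = 0; page < maxPages; page++)
    {
        using var doc = JsonDocument.Parse(await fetchPageJson(page));
        var episodes = doc.RootElement.GetProperty("episodes");
        if (episodes.GetArrayLength() == 0)
        {
            break; // ran past the last page
        }
        foreach (var episode in episodes.EnumerateArray())
        {
            // "entryId" is an assumed property name.
            entryIds.Add(episode.GetProperty("entryId").GetString() ?? string.Empty);
        }
    }
    return entryIds;
}

// Stand-in for the real HTTP call: two pages of data, then empty pages.
Task<string> FakeFetch(int page) => Task.FromResult(
    page < 2
        ? "{\"episodes\":[{\"entryId\":\"id-" + page + "\"}]}"
        : "{\"episodes\":[]}");

var collected = await CollectEntryIdsAsync(FakeFetch);
Console.WriteLine(string.Join(",", collected)); // id-0,id-1
```

In a real run the fake would be replaced by something like a call to `HttpClient.GetStringAsync` with the page URL.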
well the first thing i would try can you hit batch without the ids property just to see what it says validation errors ids field is required these are great error messages this is wonderful can you give it an ids that isn't a valid id see what happens nice okay lovely lovely very good yeah yeah so then let's do this let's take ids let's add a comma at the end that is two batch ids that's not very helpful why are you doing that i just want this what's happening this editor is not working for me let's go over here what are you actually trying to do i'm not even sure what you're doing what i'm going to do is i'm going to fake a call with 30 ids to see if i can i'm just going to give it the same two ids over and over 10 12 14 16 18 22 4 6 8 30. 30 ids right there i know but i want to see if they're smart yeah no oh my okay they're not deduping those but they allowed 30 okay well it didn't fail you should verify that there is actually 30 good point so let's go and take the result of that to your point do the same thing do it in nightingale do it anywhere a pretty funky url take the raw stuff bring it back and i'm sure that there's better ways someone's going to be watching this and say you should use awk use regex you learned van gaal you know pipe to this pipeline yeah unfortunately those are called id not uid so it's a little harder to count them there you go that's unique there's 30 of them so it did return 30.
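Splitting a long list of entry ids into comma-separated groups of 30, one batch URL per group, could be sketched like this. The endpoint is a placeholder; only the `ids` parameter name comes from the validation message read out above. Removing duplicates on the client is a design choice made for this sketch, since the experiment suggested the service does not do it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical batch endpoint; "ids" is the parameter name from the validation error.
const string BatchUrlPrefix = "https://example.invalid/api/batch?ids=";

// Split the ids into groups of at most batchSize and build one URL per group.
List<string> BuildBatchUrls(IEnumerable<string> entryIds, int batchSize)
{
    return entryIds
        .Distinct()        // repeating an id gains nothing
        .Chunk(batchSize)  // Enumerable.Chunk is available from .NET 6
        .Select(chunk => BatchUrlPrefix + string.Join(",", chunk.Select(id => Uri.EscapeDataString(id))))
        .ToList();
}

// 65 made-up ids give three URLs: 30 + 30 + 5 ids.
var demoIds = Enumerable.Range(1, 65).Select(n => "id" + n);
var batchUrls = BuildBatchUrls(demoIds, 30);
Console.WriteLine(batchUrls.Count); // 3
```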
so we've confirmed a couple of things here we've confirmed that they have a batch thing so we can call this n times and for each page grab the thumbnail and or the thing i think we'll just get the thumbnail and send them to the page so all we need okay are id name text thumbnail that comes from elsewhere i on a hospital thumbnail and that's basically it so the url then we can get up top that media one is going to require this so we need you know one call and then another call for the batch yeah one main call one batch and then we do that n times where n is you know max pages divided by 30 so in this context 15 times this is episode count divided by 30 yeah max episode count sorry and this is i thought it was 700 but it's going to be 350 so that's not too bad so it's better than we thought it would be it'll be 15 calls twice yup makes sense so here's the thing that we want our new and early in career people to understand certainly you and i are seasoned i will say is a reasonable statement but certainly not experts this is the number one thing when you're doing this kind of analysis we've been very systematic we've tested our edge cases we've said well this is a good idea that's a good idea things like that but what we aren't recognizing is that there's 50 ways to do this isn't there yeah absolutely that's a super important thing for people to understand do you have one more edge case before we move on yeah yeah and actually before we do that you know what i just realized this is a silly thing a little detail i'm going to flip myself okay i'm flipping myself horizontally because when i point i want to point here and before i was looking away see yes that's good okay what would you want okay let's just see i'm going to assume that the batch is limited or maybe not to the same page size can you make it 31 instead of 30.
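The "one main call, one batch call" plan means each page of episodes has to be joined with the batch metadata for the same ids. A rough sketch of that join follows. Every property name (`episodes`, `entryId`, `title`, `url`, `id`, `thumbnail`) and the shape of the batch response (a top-level array) are assumptions, and the sample JSON is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Join one page of episodes with the batch metadata for the same ids.
List<(string Id, string Title, string Url, string? Thumbnail)> MergePage(string episodesJson, string batchJson)
{
    // Index the batch response by id so each episode lookup is cheap.
    var thumbnails = new Dictionary<string, string?>();
    using (var batch = JsonDocument.Parse(batchJson))
    {
        foreach (var item in batch.RootElement.EnumerateArray())
        {
            var id = item.GetProperty("id").GetString() ?? string.Empty;
            thumbnails[id] = item.TryGetProperty("thumbnail", out var t) ? t.GetString() : null;
        }
    }

    var shows = new List<(string Id, string Title, string Url, string? Thumbnail)>();
    using var page = JsonDocument.Parse(episodesJson);
    foreach (var episode in page.RootElement.GetProperty("episodes").EnumerateArray())
    {
        var id = episode.GetProperty("entryId").GetString() ?? string.Empty;
        thumbnails.TryGetValue(id, out var thumbnail); // stays null when the batch has no entry
        shows.Add((
            id,
            episode.GetProperty("title").GetString() ?? string.Empty,
            episode.GetProperty("url").GetString() ?? string.Empty,
            thumbnail));
    }
    return shows;
}

// Invented sample data: two episodes, metadata for only the first.
var pageJson = "{\"episodes\":[{\"entryId\":\"e1\",\"title\":\"First\",\"url\":\"/shows/one\"},{\"entryId\":\"e2\",\"title\":\"Second\",\"url\":\"/shows/two\"}]}";
var metaJson = "[{\"id\":\"e1\",\"thumbnail\":\"https://example.invalid/e1.png\"}]";

foreach (var show in MergePage(pageJson, metaJson))
{
    var thumbnailText = show.Thumbnail ?? "(none)";
    Console.WriteLine(show.Id + ": " + show.Title + " -> " + thumbnailText);
}
```

Tuples keep the sketch short; a named record type for the show would be the more usual choice in a real app.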
that's a great one that would be a very interesting thing because then you might be limited to the size of a url which is okay it depends on the browser technically but 2k at least usually there's 31 so it didn't fail okay that's interesante might the batch api support more could we send 340 that's what i'm wondering oh my goodness which is great because it'll work until it totally doesn't 31 wow all right well so then now we have to go and break it even more yes now put 300 in and see what you get there's 30 comma 16 31 oh 31 21 62 93 awesome that should be fine bad request might be that last comma no didn't like yeah interesting and notice it sent you back an html response like we really broke it yeah well that's a 500 basically somebody way earlier probably azure or i made a mistake in my pasting possible i mean it's just a url though these are just query string values so pretty long though 404 interesting different errors as it gets longer and longer and at this point you don't know how many servers are oh now i may have been blocked the resource has been removed its name has changed are you mad at me now there's a very good chance that it whoops i may have been blocked by azure front door at this point i think you're mounting a denial of service we're back all right yeah they definitely don't like that maybe let's just try 50.
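Since an overlong request came back as an HTML error page rather than JSON, it can help to check the response before handing it to a JSON parser. Below is a minimal sketch of such a guard. The helper name `LooksLikeJson` is made up, and treating only `application/json` as acceptable is an assumption about what the service sends.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Text;

// True only for a successful response whose Content-Type is application/json.
bool LooksLikeJson(HttpResponseMessage response) =>
    response.IsSuccessStatusCode &&
    string.Equals(
        response.Content.Headers.ContentType?.MediaType,
        "application/json",
        StringComparison.OrdinalIgnoreCase);

// Hand-built responses so the check can be tried without any network call.
var okJson = new HttpResponseMessage(HttpStatusCode.OK)
{
    Content = new StringContent("{}", Encoding.UTF8, "application/json")
};
var htmlError = new HttpResponseMessage(HttpStatusCode.InternalServerError)
{
    Content = new StringContent("<html>error</html>", Encoding.UTF8, "text/html")
};

Console.WriteLine(LooksLikeJson(okJson));    // True
Console.WriteLine(LooksLikeJson(htmlError)); // False
```

A caller could stop, back off, or retry with a smaller batch when the check fails, instead of feeding an error page to the parser.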
i think i made a mis no i don't think i made a pasting mistake okay i'm pretty sure we shouldn't try our luck with uh i mean we know 31 works maybe we can just be happy that we've got the page size working and we'll just go with that i mean we could probably go 50 or 60 or whatever but at some point we're going to anger the beast yeah but then again to juxtapose this thing i was saying before about how there's lots of ways to do this and this is where i think i get a little analysis paralysis yeah because i could do this in powershell i could do this as a series of bash scripts what's the goal here right is this a dynamic api where i fake the old api and try to then fan out to hundreds of others or is this a job that runs you know once a day or once a week my perspective is in the short term it'll be a console app in the long term maybe we'll make it an azure function or a cron job or something that runs in the background i think what we could do is assemble a json file that is roughly shaped the way we want it and once a week or whenever a show is published drop it into storage and grab it out of storage that would be a reasonable thing but i want to make sure that you agree that to our new early in career people who are watching that's just one person's opinion right could be different ways i mean i'm sure you can think of different ways to go yeah i mean ultimately everything is a trade-off and then once you've decided what set of trade-offs you want to make then you can narrow it down i could then implement it whatever number of ways but still have those trade-offs be what they are and then optimize or whatever it might be after that but you know the trade-off of doing this rather than doing this in reaction to an event or in real time like the query comes in and then you fan out which is like the easiest way to think about it obviously right i don't need any other infrastructure the request comes in and my code runs
because i already have code that runs per request but then you're like well that's too much i only need the data to be as fresh as you know once a week but then you've got to deal with the other side of the problem which is what's running this code now because it's not my request anymore right and does that ever need to be invalidated and so you know cache invalidation effectively one of the hardest problems in computer science is now something you have to deal with and so for me that's like the upfront binary trade-off that you have to make now it's pretty clear like because as you said there's probably 100 ways we could have it update every week so that seems like clearly the right way to go right but we need to solve this like well who's going to produce this file is it done as part of the build and deployment process is it done via some batch that's run by some other mechanism like is it done at the startup of mine is it startup right exactly so my web app could do this there's a thing in asp.net background worker service i could have running where the site itself on startup could cache it locally and save it on local storage right not even azure storage but on disk start up and say oh did i know about this i will take the 10 minutes or the five minutes or whatever i'll make 50 calls i'll get the stuff i'll save the thing and then every n hours days whatever i'll update it that's fine so and then if we wanted to i could enumerate the types of things that i think could go wrong with that so does your web app have write access to the folder that it's running from right am i running in a container right are you in a container will it ever get overridden or overwritten or deleted as part of normal processes maybe if the machine restarts if the container restarts what happens or is it being read while being simultaneously written to are there
concurrency issues that's a great point and answering all those questions or like going through the thought experiments for each one of those will help you further refine well okay maybe the answer is yes to one of those but i can circumvent that by doing this and now we've got something waterproof and still relatively simple because we've only really gone what two branches down the thought process tree so far and we've got something that meets what we want and relatively simple is another thing because one person's relatively simple maybe their skill level another person's might not so for example issues of concurrency and asynchrony that's something that you're better at than i am so if i were to do a naive implementation of this i would probably write a couple of nested for loops i would describe a projection or a view model of my idealized system i'd make a list of azure friday shows and i would make these calls completely synchronously one at a time deserialize them and then go left hand right hand i'd go here's the thing from the docs api make an azure friday show put it in a list and i would do that knowing however that the correct way is to do some naive parallelism where i'd say send out 30 threads simultaneously to go and do this and then wait the use of the word correct there is still somewhat bothersome because correct because i feel less than because i don't want to do it into it but it all comes back to trade-offs and so by choosing for example to introduce parallelism you're going to use more resources of some sort while doing so maybe you don't want to burn through that many outgoing requests from wherever you're hosting this or cpu cycles or whatever it might be or i don't know like i may have had a malformed url a moment ago when we saw me get blocked or i could have upset an ingress somewhere that's right triggered something right rate limiting throttling we don't know if i decided to go from
from hello to 50 gets all at once do i need to stagger them are they going to start to block my ip address these questions and correct me if i'm wrong but these questions are the difference between later in career and early in career engineers and the only way you do it is by suffering well i have yeah i've been rate limited before well that's funny when you say the only way of kind of getting to that point is suffering you know people often say that you know learning is all about failing and then trying again and it's very hard to learn and grow if you don't fail you don't learn anything from success you learn from failure and that takes time that is simply a factor of time and then what you do with that time and so if you are constantly failing and you do it for 20 years then you can retain at least some knowledge you can remember i did a thing once and it caused this type of issue that's as specific as it has to be you don't have to know what api you called you know there's docs for all that stuff right or other people you can ask but if you can at least retain the failure modes that you've encountered over time it will become easier to do this type of you know thinking while designing or feeling out a potential solution for something so what do you do if you're not you know one of us who's got some time in the business and are also kind of you know dudes of some renown and you do a youtube about this problem and all the comments are like you poser you didn't use this library you poser you didn't do asynchrony right i'm hanselman maybe i can get away with it but somebody else might be discouraged how many negative comments about you suck that was the wrong way yeah chase you out of tech and you just go like fine i guess this is not for me i must be doing it wrong i know in my heart of hearts that i wish i could write perfect asynchronous thread safe intelligent code
that fanned out and used the resources properly i know i could do it but i know i could write naive code quicker right but if i did that someone might judge me and they might say that's amateurish that's not production ready this is kind of one of the challenges with putting content out there like you're talking about is that it doesn't allow for the type of iteration that really is what we all rely on to get to our ultimate solution which is you know correct for the set of trade-offs that we currently have made and the conditions that we are currently experiencing something that is correct today can be wrong next week when you run a promotion and now you've got 10 times as much traffic to your site it worked perfectly well up until that point and you made the right set of trade-offs and to fix it might require a complete re-architecture it might require introducing something more complicated like you're talking about and in fact what you said before about writing you know going back to hey in this case let's write something to disk that introduces issues of concurrency with regards to scale out like what if you want to use one of your cloud provider's features to like go from one instance to two instances to handle traffic well which one is writing are they both making 50 requests each and then saving to their own disk or you're going to somehow move that to someone else's sort of problem domain now and then have those two nodes read from that so a naive and when i say naive that's not a negative word it is simple it is a simplistic reasonable way of doing things could say well if i'm going to fan out to two that's fine if i'm going to fan out to 200 i probably don't want 200 nodes making 50 requests at the same time but then someone might say you need redis do you do you so it's interesting like what you just said then there's almost step functions right where you can introduce or think about will going from one to
two cause a problem so doing things with one cpu with one instance of your app where you've got infinite time to have the app boot up yeah basically anything goes right then you go well i don't have infinite time i want the app to boot up in less than 30 seconds and have everything okay you've introduced a restriction let's see what changes as a result of that well the obvious next one like you know maybe we've got two cpus now instead of one so we've got a bit of real parallelism but once you go to two instances of the app from one that changes a bunch of things and so there's a pretty decent step function to think about but as you just said what about going to 200 well going from two to five probably won't change much yeah yeah or two to ten but this whole orders of magnitude way of thinking what about two to twenty yeah well now it starts to feel a bit icky because of what you just said i'm hitting it with 20 times as much traffic right in the bad situation for example i had to reboot my web farm all of a sudden right well a senior engineer would say from two to two hundred that's a great problem to have yeah do you have that problem do you have that problem i do not right so again the pragmatist the practical perspective would be well have one of them do the job or have a separate sidecar process do the job save it in storage right and then pick it up and i love this acronym here yagni right yep you aren't going to need it like it's a great problem to have do you really have that problem and i think that point of like how could someone get that thing done so quickly because they limited their scope they limited their scope and there are dozens if not hundreds of ways to solve this in any different language in any different way but quick and dirty right what do they say you can have it good fast or cheap pick two there's the you know it's the classic triangle and you pick two so we've only got a
little bit of time let's see what kind of trouble we can get into is that okay with you all right so we determined i believe before that we can go and get the episodes with this right here so we're just doing this in visual studio and we've got a .net 6 console app and we've got a url template here that is going to be like what is that we'll call it whatever it doesn't really matter we can go and say this is actually kind of cool by the way if you spend a lot of time with the intellicode stuff it kind of freaks me out yeah it's on in mine and like when i'm coding it makes suggestions and i have mixed feelings about them some of them they're like yeah convenient i was making a uri and you guessed that probably because i put the word url you know in this but then i look at this url right here and i realize that that's the thing that's going to change yep right that's the thing i'm used to doing stuff like that yeah i don't know if that's the right way to do it and then put a what is it like this yep so that's an interpolated string oh why did it turn color there already it's mad at me it's not a real url maybe i would keep it as a string for now right and then make it a url when i have a proper one i don't know why our syntax highlighting is salty but that's still a thing right i'm going to change that at some point and then we had our batch i think we felt pretty good about that batch didn't we and that was going to be just comma separated entries you know and i don't know if it's going to be zero or you know how we're going to concatenate these things who knows right who knows all right so sometimes people in dot net have said that it's you know historically been a little bit confusing about which http client to use depends on how long you've been in it maybe now it's easier i would honestly at this point go and google and i might use http client but maybe
there's a new one what's the right way to do a simple http get yeah http client's still the default answer where it gets a little more complicated is if you've got an existing code base that you know was around since before http client existed and or if you are running in a server application where managing the lifetime of http client object instances yeah actually becomes something you have to worry about because one they might be expensive two you have to worry about things like dns caching and reuse and so we have some higher level types that you can use to help manage those things for you okay i typed http client you notice that it says system.net.http it's automatically turned green indicating that that namespace is in scope that's because we have global usings or global namespaces you may be familiar with .net applications that have a whole bunch of usings at the top and you type http client you'd hit control dot and then it would add a using statement but there's a bunch of defaults and certainly system.net.http is available to us by default so i'm presuming that's why i got success early on right yeah i hit enter and our intellicode suggests a base address right which is interesting and it even goes so far as to take one of our existing urls right and bring that in which is pretty darned interesting which then implies i could go and format this in a couple of different ways right in the old days i would do something like was it string dot format but how would i do that these days like here so in this case the base address would typically be like the server just the server address so it would be docs.microsoft.com so if you think back to when we were looking at the document being returned the url that we give you for each episode did not include that part yeah that's very common because you might be calling this api in different environments or indeed the code doesn't know necessarily
where it's running right it's just running and you got to it from docs.microsoft.com because that's production but it doesn't know that that was terminated at some server four layers ahead of where the app is running and then the only way it would know that is to investigate or sniff the incoming url so typically yeah you choose your base and then you go from there so in this case it might be staging it might be dev it could be localhost if we were to practice things as well and then we can keep off the slash or not it depends on what we're using to put our urls together but the implication will be that a slash will be added if you use the right api to join it if i use the right api so something to think about always think about your slashes are you going to end up with not enough or too many just be conscious of that right okay cool remember what we do with our client we've got gets we've got get async we've got get string we've got stuff like this that we can just use and we used to have back in the day just gets but now everything's async these days and i know i want to get a string i can just get used to that yeah for http client in particular there is a synchronous version available i don't know much about it because i don't use it but there is a synchronous version available the get methods here are basically convenience methods that hang off http client that effectively internally are calling a bunch of other methods on your behalf to set up the request populate the request body if it needs to be the headers send the request and then wait for a response to come back read the response headers and then finally read the content of the response body in whatever form is required for the method you called in the case of get stream it would give you back a stream in the case of get string it's just going to give you back the raw contents as a string so now when i'm going to go and call this i'm going to
go and ask something like this this is the structure i've got this document that includes total count an array of episodes where an episode includes a number of things some of which i can probably ignore don't really care about a couple of others that i want do i need to make an object that is shaped like this and how correctly shaped like this does it need to be so this is where you could go a few different ways right you could decide to basically work in raw json and do everything in json without creating types you say work in raw json like it's a document object model and pull it out loosely typed yes i mean you could be really raw and just work with it as a string oh no i would prefer not to so that's a great point actually let's talk about that for a second you're initially saying get and then it's like get byte array async right that's the lowest level unlikely that you want to do that if you're thinking about a high level problem you might think get a string but then is it really my job to be a json parser right now right depends on the complexity of the json right if it's just one value you want to get yeah maybe you could do that but there's all kinds of things that get handled for you in encoding and date times and things that can be made smarter if you decide are you working at the byte level the string level the json level the object level do you just want the serialized objects to pop out the other end right and then there's the question of is this streaming or is it non-streaming i guess is it a single atomic sort of operation where you send the request and then you don't get anything until you get everything right this is a non-streaming operation meaning i do a get and i get back something and it finishes right which means that everything that comes back has to be you know stored in memory so it can be given to you as the return you know the result of
the method that you're going to call and you can stick it in a variable if you're dealing with apis that return very large things sometimes they'll support streaming and so you'll actually write your code in a fairly different manner where you might give it a method that gets called over and over uh as new results come back from the initial call in this case we don't have to worry about that uh it's the simplest thing we could do is just get string async give it a method like we were doing in the browser before a url i should say and you know verify that we're actually getting back the body that we were seeing before now um in this case this is an interpolated string i would oh yeah so that interpolation yeah what is it that you're attempting to do there exactly well so in the old days i would go like this right and then i would pass in page number right but in the world of interpolated strings i'll make a copy of this section here and i'll say what if this was hey this is a string where that's an actual value right if i had a value like int page number equals zero and i want this to be page num i might want to feel like i want to use the modern way to do things and i'm not sure how to format that so you're running into it so what you have done up top is you've used the old ordinal style format but you've declared it as an interpolated string now this you can't pass page number there because it has to be a valid symbol right and it is down here which means i can move it up right but i don't know when this evaluation happens it happens at the time that the string is created yeah which is not the time i want right so then maybe this is not what i need right so it is appropriate to use the older style string format in this case yeah the other way you could do this is because you know that it's a collection of things being passed to the query string you could have a query string collection effectively where you just set the value of
the one that you want every time and then ask the uri class to produce you the output every time but i don't feel good ways to do something yeah i don't feel bad about that i just don't feel modern about that you know what i mean and these are things where doing string.format makes me feel a little old-school yeah so maybe so like you need to remove the dollar sign from the beginning of your string literal so this is no longer an interpolated string it's just a string that happens to have a placeholder and this replaces the placeholder ordinally in order right cool and this is unused and that's why they're mad at me right there and i know that i could right click and run this but i also like to go and do things from the terminal myself so i'm just gonna go and say cd here and then we can probably dotnet watch this at some point can we there we go look at that so there's some json right there at the command line and at this point i might want to use a json um prettifier right and run that through i think what is it called jq or you know one of the json uh prettifiers at the um command line have you ever used those json beautifiers nice so i know you're not as much a uh command line person as i am i'm kind of obsessed with the uh the command line when i do i tend to use powershell and so i rely oh yeah the functions in there right so i could go and make a call dotnet run that's gonna spit out a bunch of json then i could dotnet run and then pipe it into i think is it format json i would be off googling for this stuff right and figuring out like so here i am now we're googling at scale uh json beautifier powershell pretty json prettify json in powershell convertto-json convertfrom-json yeah that's basically what they're doing there's a thing called jq that's actually better for this which is just a linux based prettifier
all this is going to do though is chop it up like this so we stuck it in and then spit it back out and it didn't color it but at least it told us that we got back what we wanted so that's cool okay so we return back here and again i could do this i could actually uh if i wanted to do like a read line this was a problem i was trying to solve you know for a long period of time and i could say while true yeah and then we could do a hot reload or something and then just keep making changes and iterate more quickly uh we'd have to move it outside of the main distance so yeah put it into it it would work but you get the idea but that worked right off the bat but we're returning a string we haven't got any objects to your point so then do we want to deal with raw json or do we want to do something else because right now we really don't need a lot like i don't necessarily feel that i need to model a bunch of stuff there was a thing back in the xml world called xpath i know that there are json equivalents where you can just pluck out the two little nuggets that you need yep what do you think i need title i need url i'm going to need entry id and description for later so if you had jsonpath that's four things in theory i've never used jpath jsonpath but it's the spiritual equivalent of xpath xslt type stuff if i understand it and so you can in theory just you know just use a json deserializer that supports jsonpath which system.text.json doesn't yet they're looking at it uh for .net 7 right but json.net does and there is uh yeah there is a system.text.json.node there's a thing right so we could try that that would allow you to work with this as like you said before as a structured document effectively so rather than being a string you would create a json document from this string um and then you would move around in it using that and that's probably the easiest thing to understand you don't have to learn a new syntax which you
know jsonpath would be you just move around inside of it logically based on the structure of it let's try it not knowing what we're doing because it's fun to try those things and we only have a little bit of time so json.nodes node that's what it's called why am i not getting my that type name you just want the namespace oh that's a type name no i think it's a you sure 99 percent sure it's not well the compiler agrees with me sorry yeah touche would have sworn that was a thing let's find out what nodes ah there you go plural there that's right so we were both right do you actually need that namespace just to get the document though looks like it's not necessary because it's uh it's available to everybody so that's already not needed so that's kind of cool so then what we could potentially do then is say var you know json object equals json node dot that's saying it isn't oh that's it wasn't necessary because i hadn't needed it yet you hadn't used it yet that's right i hadn't used it yet so then we'll pull out uh we'll pull out foo i'll move that down here okay okay and then now we've got this json object that's been pulled out of this string so can i ask you why you wouldn't just start with json document rather than diving straight into json node yeah i would probably say ignorance on my part okay so there is a json document class which is in the root system.text.json namespace rather than the nodes namespace and i don't know i wouldn't just offer it yet okay what does that offer me well that's like xml document so cast your mind back 15 20 years you would start saying i have a document that is json which is the response that you got right and that is different than a node how it's not necessarily a node represents anything in the json tree anything at all right so a json node would be a property name is a valid json node well that might only be a symbol but it depends on how they've structured their compiler and then right
right right syntax tree okay well one idea if i were alone and going to learn this is i'd put a break point on here yeah and i'd hit f5 and we'll drop in a debugger real quick here and i'll just hover over them and i'll see which one feels friendlier right oops i need one more line to make sure that i don't actually finish because this thing just stopped on that last line so we'll hit f5 again okay so now i have two objects i've got a json document right here oops i'm trying to zoom in let's try this again there we go looks like it's got a root element and i can see the json inside of it i'm not seeing the dom the way i would expect to like wander around inside of that dom that might be my own ignorance speaking while a json object should be a json object which is a json node that looks immediately friendly to me interesting is there a particular thing that you think that we would do on json document that would be more friendly than mine yeah it is literally just what i would have started with like i'm trying to click around in it now like typically the document apis you'll get a document structure you'll get the root element um property which is where you would navigate from um but i didn't see what it was giving you underneath the element the element will usually have some type whatever the root element was let's humor me for a second because uh why not right and i'll just put them next to each other and we'll say uh json object and i'll say um episodes no excuse me that was already parsed json object at zero no that doesn't feel right did it add episodes at zero you're in there too by the way yeah honestly my machine is having a whole bunch of hard time right now for some reason that's okay but i'm watching yours json object at two the first one gives me episodes which then has a json array so right now we see that there's the json count so we can go and say int total count or you know again there's
always the whole religious argument about that versus var um at zero and then there's var and actually probably more like for each right look at that and then tell it and tell a what's it not funny you know where item is probably going to be another that'll be a json array we'll have to tell it more information and that'll probably be another json object underneath there and then at this point i might want to just spit it out one at a time i'm just kind of like getting a sense of what's going on here yeah feeling weird about this because it's not null here uh they don't have the ability to enumerate but i know that there's an array here it is in fact a json array i have to say as array in order to do that there we go okay let's hit f5 so i pull out my json object we came back json object at zero must be just called total count which means i need to ask for it by name i can't just pull it out and assume that it's going to get what i want it looks like so when i want to ask something of it it sounds like i need to be a little bit more specific using this api uh no well okay what i saw looked much more like the exception assistant was showing you the exception on the wrong line oh really yeah like because the exception message you got was that the type of the thing that you're trying to pull out must be of array which i would have thought that's not going to be line 21 that's the problem it would be line 22 where you're trying to get it as an array let's find out but we'll see what happens right node must be of type json array so json object right here at zero is what is that you'll need to use the um the locals window i don't think you can see it in yeah okay so we'll come down here oops i always hate that now i've angered the beast it's like which one of those six boxes does it go yeah yeah exactly so we've got json object right here at zero it's nodes it's another json object right so it's a hash table of hash tables and then it
has zero zero this feels maybe a little bit more loosely typed than i'm used to well this is big and this length goes back to my initial sort of surprise at using nodes because in my mind like walking this using nodes it means you're literally going node by node by node in the json um i was hoping i'd be able to do something you know friendlier and i might still be people are probably watching and saying you know you're dummies you know like episode count or zero at episode count and i move on with my life you know what i mean yeah and that's i want that depends on the api right depends on the api um and you know i google around a little bit here and i can see comments on you know what's new for like here we go here's an article at c sharp corner that talks about json node which is that abstract class and json object which does allow you to like both create and pull things out and then i was thinking about doing stuff what i thought would be friendlier like this yep you know which seems somewhat friendly to me have you tried that yet well i'm trying and failing because i'm not holding in my mind as people are probably yelling at us that we have this json object that's actually not an array right there that's total count and i'm not thinking so that should literally just be total count and the next one will fail i'll just let it fail i'm also counting on this thing to box this up and tell me whatever type it's going to be let's find out hit f5 parse parse total count so that's there you go i was overthinking it yeah that came out automatically json object figured it out and total count now here's the funny thing though total count looks like an int not an int what do you mean it looks like an int well it looks like an int because i'm saying it looks simplistically you hover over it and you're like look looks like it has a number it looks like a simple type but because i was being sloppy with var i'm saying yeah i don't know you
figure it out right tell me what you think it's going to be and it did and it came out as another node right so then you have just to be clear though when you say you figure it out there's nothing to do with runtime like that's totally what's underneath i can and it's clearly telling me that so then i can go and say well what does it convert to int or i got to go and pull out the dot path of the thing you know i just want to get the value of the thing as value let's see if that gives me an int that would be a method right there's a method what is it what type does it return the derived json value type then it's going to be a json value right which is another thing i got to go and dig into right and then figure out you know it could be a nullable int or whatever right well it's going to be something right it's going to be an int and now i have no idea now i have a property that's the thing you have to like it it's an int now it's an actual int yep yep well you cast it that's why like i forced it i was like you're going to be an int because i told you so and well and even then it's not that straightforward right the value has to have been one that the runtime is allowing you to cast to an int which means that there's either an implicit operator um or something like that yeah right and there's two runtimes to think about here there's what json is thinking about as it's parsing there's the clr itself and then there's what's happening here where we have no idea the var in the compiler has no idea what's going to come out it says json value so unless i explicitly go like that and force it i don't know if that won't cause a runtime problem it might very well cause a run-time issue right yeah right let's find out it probably won't because you've done it it won't because it is an int and it's well it wasn't an int at the moment it is now an int it doesn't mean that it
was an int i mean this is a subtle no this is a great point and let's fight about that for a second because you're absolutely right it is a value that is easily coercible into an int at runtime and i was given grace it's not coercible because the compiler didn't do anything so oh okay let's talk about the word coercion yeah so like typically when we talk about these uh my understanding anyway is that coercion is something that's going to happen when we talk about c-sharp and .net is the compiler may be able to coerce something right right at runtime runtime happening yeah at runtime someone in the runtime has to give you an int back like an object that is being stored in the runtime um of type int and anything can be turned into an int if it has an implicit operator well and here's the deal so if you look at this surprising thing this is a list of operator overloads explicit operator overloads there they are this is a defined explicit conversion to int of a given json node so i've got a json node i said give me the value underneath it the held value and effectively call the explicit operator overload did you just flip yourself too what was that you just like yeah like i said my it's tearing it's doing all types of stuff okay cool so that gives me total count so this is actually not horrible i don't hate this but we do want to point out that there are lots of ways that we could have potentially done that right so then we've got json object at one which here is episodes right um i suppose i could probably ask for it by name episodes right as array in this case it's going to end up being a json array then the question is is a json item and json object going to allow me to enumerate through that does this thing you know um implement ienumerable let's hit f5 and see what happens when we go through here we're going to just f10 now we have an item of 12.
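editor's sketch: the node walking described above, indexing the root by property name, using the explicit int conversion, and calling as array before enumerating, can be tried in a few lines. this is a minimal sketch and not the code from the session; the json payload and the property names totalCount, episodes, title and url are assumptions standing in for the real api response.

```csharp
using System;
using System.Text.Json.Nodes;

// Hypothetical payload: a total count plus an array of episodes (names are assumptions).
string json = "{\"totalCount\":2,\"episodes\":[" +
              "{\"title\":\"First\",\"url\":\"https://example.com/1\"}," +
              "{\"title\":\"Second\",\"url\":\"https://example.com/2\"}]}";

// JsonNode.Parse returns JsonNode?; the root of this payload is a JsonObject.
JsonNode root = JsonNode.Parse(json)!;

// Asking a JsonObject for a position instead of a name fails at run time,
// because the integer indexer only works on a JsonArray.
bool indexByPositionThrew = false;
try
{
    _ = root[0];
}
catch (InvalidOperationException)
{
    indexByPositionThrew = true;
}

// Ask by name instead. The cast uses JsonNode's explicit conversion operator;
// GetValue<int>() is the method form of the same request.
int totalCount = (int)root["totalCount"]!;
int sameCount = root["totalCount"]!.GetValue<int>();

// A node is not enumerable until it is viewed as a JsonArray.
JsonArray episodes = root["episodes"]!.AsArray();
foreach (JsonNode? item in episodes)
{
    Console.WriteLine(item?["title"]?.GetValue<string>());
}

Console.WriteLine($"total: {totalCount}, listed: {episodes.Count}");
```

the null-forgiving operators keep the sketch short; every one of them marks a place where a missing property would only be discovered at run time.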
look at that this is actually kind of clean it did not to be clear we're not deserializing i'm trying to zoom in and capture that moment at the same time so forgive me this is a flaw in the zooming tool there we go so what we got here is a json object which looks like an array but it's actually kind of a hash table a thing so i can ask for it by ordinal value or by name and this is going to make it really convenient for me to go and say give me the title give me the url go go go right but correct me if i'm wrong i could have made a c sharp object that expressed as a projection of that roughly this deserialized it if you know that the shape of what you want to get out is just a subset of these things i think it would be far simpler at this point to simply declare that type and then just use the top level deserialize um method to say please deserialize this json blob into will the top-level deserialize thing ignore the stuff i don't care about yeah okay it should if it doesn't i'm wrong but that would be really strange that you can't it would fail because it has a value that it can't put in because there's no spot for it is this an opportunity to use a record yeah does json deserializer deserialize into records i don't see why it wouldn't the only thing that would be an issue potentially would be things like primary constructors and stuff but i believe that would all be supported i'd be very surprised if they didn't implement that cool so you know records are kind of like you know how would you describe them they're not structs they're just kind of like friendlier easier well there are records records are a language feature they're not a type system feature a runtime type system versus the language's representation of those types but there are now record structs in c-sharp 10.
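editor's sketch: the two questions raised here, whether the deserializer skips members the type has no spot for and whether a record with a primary constructor works, can be checked with a small program. this is a sketch under stated assumptions; the payload, the member names and the show record's shape are illustrative and not taken from the real api.

```csharp
using System;
using System.Text.Json;

// Hypothetical episode payload with more members than we care about.
string json = "{\"title\":\"Episode one\",\"url\":\"https://example.com/1\"," +
              "\"entryId\":\"abc123\",\"description\":\"An example\"," +
              "\"durationSeconds\":600,\"tags\":[\"azure\"]}";

// The assumed JSON names are camelCase while the record uses PascalCase,
// so ask the serializer to match names without regard to case.
var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

// Members with no matching parameter or property (durationSeconds, tags) are skipped.
Show? show = JsonSerializer.Deserialize<Show>(json, options);
Console.WriteLine(show);

// Positional record: the serializer binds JSON members to the primary constructor parameters.
// In a file with top-level statements, type declarations have to come last.
public record Show(string Title, string Url, string EntryId, string Description);
```

a record with init-only properties would deserialize the same way; the positional form is used here only because primary constructors were the open question.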
but before that records were more like classes okay so if we say public record show and then we say uh you know public oops are you in there or is that me making a mistake that's not me sorry public record show and then you know i usually go like prop tab tab and i think about like okay what do i need i'm in a break point here i'm writing in the debugger because the benefit of that is i can see what's going on down here in the bottom that's why i'm writing this in the debugger so here's the things that i need right so what do i need i need the title now i can say get and init which will only let me you know set it once and then i'll go and say i need the url i need the title i need the entry id i need a description right url description i'll do it for the purposes of hopefully simpler deserialization i will uh give them the exact same names yeah that way i don't have to do any mapping right that's all i really want i don't need more so i'm hoping that it will pick those out that's what we're assuming here right that we're going to be able to deserialize so here's our example of getting a string so now i have an item how do i say item as object so what i would do personally is give me my thing not do it as json object or json node at all at this point i would create what's that it's too late man no what's not like you can create so you've got your show type but you can have a type above that or you can just ask it for a list of shows and it'll find the member so go back to your json what is the member in the json payload that episodes right uh yeah well it's sitting right now in items like i've already got it there oh i need to go okay so my point is i have a json object as a fragment right that's a fragment okay i have a piece of it i have a node i want to deserialize this here so what i no i've never deserialized a node you could either try the json serializer stuff there's actually a json serializer class
static class which has a bunch of helper methods on it and it takes yes that's what you want it takes json node so json serializer dot deserialize and there'll be a generic version of that is my guess yeah so deserialize as show yep and then you give it your item and then we say item yep and then we say var show i know that there's people who are screaming who are like old heads that are like var var var why all the vars less typing for me don't care don't care uh items you need to put your record declaration last is that what it's complaining about no that was done way way up at the top what's the actual error i mean uh they're mad at me because top level statements must precede namespace and type declarations right so you need to put your oh you're right yeah i'm sorry grandfather you need to put your type declaration at the bottom that's me being a c person and thinking that c you know put all the stuff at the top rather than the bottom okay because this is a top level application you want to really have your using and then there's your first line as you can see and i was randomly declaring a class in the middle somewhere okay so let's go ahead and hit f5 i know you have a hard stop so we might have to make a part two here yeah so let's hit f10 okay what does the show give us look at that beautiful oops let me try to zoom in on that again i gotta capture it so i'm gonna hit that ah they're not gonna let me do it there you go so that deserializer took that json object so we just switched from kind of a loosely typed kind of a laterally typed thing and then suddenly decided to like nail it down and now we got a real object we have a projection of just the couple things we wanted we asked the serializer to do all the hard work for us that's basically what we did right we said we have a type that we've defined which has a shape and then we have some json in this case it was a json node because we
pulled it out of a larger json document and then we said can you fit this blob of json into an instance of this type please and then it did its defaults it went okay well you've got a property called whatever that is of .net type whatever else can i fit you know is there a json property or member of that name and does it have a value that i can turn into the type that it needs to be for your .net value so let's make sure that people understand the reason that we didn't why didn't you just do that from the beginning well my perception was my analysis was that making an object that was shaped the way i wanted it was going to be more hassle so what i did is i used more loosely typed things to dig down to this moment this highlighted purple area here and then i went to the serializer yes you could certainly have described something that was shaped more correctly right and skipped right that and that and digging here but it also showcases the flexibility of what's available to us right absolutely and so at this point we could in theory create a new type that represents the final result well and honestly we could do that or show is pretty darn close i might just make a list of those you could but i mean even if you want the uh what were you pulling out before episode count or something is that good total count total count is uh is there right line 16 so you could remove digging into the json object just by having a type that is the shape of the response from the api yeah so that could be less magic string and i could just pull that right out as well yeah yeah you're starting to get a little shaky here too and your audio and your video are out of sync so i think we'll probably call this a show but let's hit f5 and make sure that we can actually do this thing completely so i'm going to just hop back out to my command line here that's actually going to fail because it's not really
json so that was a mistake i've angered the beast here we go dotnet run and look at that so that is calling that api and outputting just titles yep so that's proved that i can do that so then now i need to maybe next episode you and i will then fan out and we'll do it either synchronously which is easy put a for loop around it and go as many pages as you need to or identify the number of pages and send out 30 or 5 at a time or whatever in a parallel event and then we'll return back a list of shows that have the things we want we'll call our batch in a similar way and then save that to storage and we'll be done but hopefully this has provided some value to some folks so they understand the kinds of thinking that you would apply when solving these things and even the mixing and matching of styles like it would have worked if we'd done it your json document way it would have worked if we'd done it all json deserializer way but doing a mix of it doesn't make for you know nasty looking code i don't know some people probably don't like that intermediate object there you know a lot of this goes through i mean as i said tidying up it can be done iteration is the name of the game right when doing code no one is expecting people to perfectly code the right result the first time they sit down at the keyboard for any given problem it is all about iteration and you know in this case where there's two people doing the iteration and we're learning a few apis we might have a bunch of foundational knowledge but i haven't ever actually used this specific api in .net because i always use the higher level stuff for the types of stuff i'm doing but i understand the concepts like you do and so we go and look at the docs we explore we use the debugger and we figure it out as we go yep exactly and in some cases of course we see that there are squiggly lines that are telling us that like this could be null you know it is absolutely true
that i could be pulling out a thing that doesn't exist and i wouldn't be able to tell you until runtime so this is not yet defensive uh defensive code by any means so that's another thing to think about cool well thanks for your time sir appreciate it thank you all right so if you've enjoyed this kind of stuff if you want to see more coding whether it be with me and damian or me and david fowler or maria or any of our friends on the .net team leave some comments and we'll get you some more content like this and we'll see you again another time
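editor's sketch: to pull the closing points together, the following is one possible way to combine the mixed style from the session (walk nodes down to the episodes array, then hand each node to the serializer), the page-by-page for loop, and a little null checking. it is a sketch under stated assumptions and not the code written on screen: FetchPage is a hypothetical offline stand-in for the http client get string async call, and the url template, property names and show type are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Nodes;

// Hypothetical URL template; string.Format fills the {0} placeholder at call time,
// which is why a plain format string is used here rather than an interpolated one.
const string UrlTemplate = "https://example.com/api/episodes?page={0}";

// Offline stand-in for HttpClient.GetStringAsync so the sketch runs without a network.
string FetchPage(int page)
{
    Console.WriteLine("GET " + string.Format(UrlTemplate, page));
    return page switch
    {
        0 => "{\"totalCount\":3,\"episodes\":[{\"title\":\"A\",\"url\":\"https://example.com/a\"}," +
             "{\"title\":\"B\",\"url\":\"https://example.com/b\"}]}",
        1 => "{\"totalCount\":3,\"episodes\":[{\"title\":\"C\",\"url\":\"https://example.com/c\",\"extra\":true}]}",
        _ => "{\"totalCount\":3,\"episodes\":[]}"
    };
}

var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
var shows = new List<Show>();
int totalCount = int.MaxValue; // unknown until the first page has been read

for (int page = 0; shows.Count < totalCount; page++)
{
    JsonNode? root = JsonNode.Parse(FetchPage(page));

    // Defensive: the property may be missing, so stop rather than throw.
    // (AsArray still throws if "episodes" exists but is not an array.)
    JsonArray? episodes = root?["episodes"]?.AsArray();
    if (episodes is null || episodes.Count == 0) break;

    totalCount = root?["totalCount"]?.GetValue<int>() ?? shows.Count + episodes.Count;

    foreach (JsonNode? item in episodes)
    {
        // JsonSerializer has a Deserialize overload that accepts a JsonNode fragment.
        Show? show = item.Deserialize<Show>(options);
        if (show is not null) shows.Add(show);
    }
}

foreach (Show show in shows)
{
    Console.WriteLine(show.Title ?? "(untitled)");
}

// Type declarations must follow the top-level statements.
public record Show
{
    public string? Title { get; init; }
    public string? Url { get; init; }
}
```

as discussed above, a type shaped like the whole response would remove the "episodes" and "totalCount" magic strings; the node-based version is kept here because it mirrors what was built in the session.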
Info
Channel: Scott Hanselman
Views: 16,465
Id: dqNLBJKpNAI
Length: 67min 39sec (4059 seconds)
Published: Thu Dec 02 2021