Program your next server in Go

Captions
All right folks, now we're going to start up again and shift gears a little bit. I am old enough to have started programming in 8080 assembler, made my way through C and C++, live in C# mostly, and if you twist my arm I'll admit that I do JavaScript as well, although mostly reluctantly. So I am most interested to hear Sameer talk to us about Go.

Hello, I'm Sameer Ajmani. I manage the Go programming language team at Google. I didn't start off as a manager on the team. I joined as an individual contributor, a programmer, about four years ago and then slowly took on management duties. At the time I was developing the libraries and infrastructure needed to use Go inside Google. The language had stabilized at that point, and we really wanted to make it useful for building production systems internally. My background at Google is in distributed systems, so I found Go to be remarkably interesting for building those kinds of systems. Then just recently, this February, I took over management of the whole team, so I've been trying to wrap my head around the broader problems and challenges of making a language useful for the great diversity of things people are trying to do.

I'm going to talk to you today primarily about server programming. That is sort of where Go came from, and hopefully I'll get you interested enough to program your next server in Go. Here's the outline for the talk. We'll talk a bit about what Go is, where it came from, the history, and who's using it today, both inside and outside of Google. I'll spend a few slides comparing Go to other languages, just to give you a sense of where it falls in the space. I'm going to spend the bulk of the talk presenting code. This is a very code-heavy talk. I'm not going to be writing the code, but I'll walk you through it and show you some example executions. I'm going to be presenting a server, a web front end with an interesting distributed search back end, and that's going to demonstrate some of Go's concurrency primitives in an interesting way, and
I'll conclude by giving you tips for how you can get started with Go in your organization.

So what is Go? Well, according to golang.org, it's an open source programming language that makes it easy to build simple, reliable, and efficient software. That looks great on the label, right? What you need to know here is, first of all, it's open source and free. Second of all, simplicity was a goal from day one. Everyone says simplicity; I'm going to try to make the case, through both the code examples and a discussion of what is in the language and what's not, that we really worked hard to achieve something new here. In terms of reliability and efficiency, Go is a safe language in its design, it's designed to produce fast programs, and it also compiles fast, which is great for your developers. So I hope to show you throughout this talk how we achieve the goals stated here.

Go's design began back in 2007. As legend has it, Robert Griesemer, Rob Pike, and Ken Thompson were sitting around waiting for a large C++ program to compile, and so they had some time to think about everything that was wrong with the programming languages Google was using at the time. They were joined shortly thereafter by Ian Lance Taylor and Russ Cox. Go became open source in 2009 and has since built a very active open source community. The language stabilized in 2012; that's what we call Go 1. Since then both the language and the standard libraries have been backwards compatible, so programs you wrote in 2012 compile and run today. We're going to be releasing Go 1.7 this August.

So why Go in the first place? Why were Ken and Robert and Rob sitting around griping about the languages? What problems did they foresee? What were they trying to solve? Fundamentally, Go is an answer to the problems of scale at Google. Here's a picture of a Google data center. Obviously Google comes from a background of massive programs that run on lots of machines, and also massive teams that are
trying to get big jobs done. So there are really two problems of scale.

The first one is system scale: designing programs to scale to thousands or even millions of machines. It's common at Google that jobs are running on thousands of machines and are communicating, typically via RPC. Furthermore, disparate jobs need to coordinate and interact with others in the system. What this means is that there's a lot going on at once in our software. The fundamental way that we help our users get a grasp on that is by building support for concurrency into Go and making it very easy to employ, so you can structure your program in a way that makes sense of this madness and deals with things that are happening concurrently, things that deal with failures, and things that take different amounts of time to execute. I'll go deep into this in the example and show you the kind of power you get from building this into the language.

The second kind of problem is engineering scale: supporting large teams building large code bases and iterating quickly. These are old numbers, from five years ago: we had 5,000 developers across 40 offices, lots of changes to the code base, lots of turnover in the code base, and lots of test cases executing. We build out of a single giant code tree and we run all the affected tests with any important change. So we needed a language, and a build, test, and deploy model, that scaled with this and helped us deal with the problems we were having in Java and C++ in particular as they scaled on this axis as well. So Go tries to solve both of these problems.

Actually, before I go on, one thing that's interesting here is that while we were trying to solve Google's problems, both of these have turned out to be not just our problems, and not just other enterprises' problems, but the problems of people deploying software to the cloud or dealing with these kinds of large-scale, very interactive, concurrent systems. And it's not just true at large scale; it's
even true locally: if you're dealing with a mobile device, there's a lot going on at once. There's a lot happening in the background, as well as user interactions, as well as interactions with other services and notifications. So this notion of concurrency, and giving the programmer a handle on it, is critically important to a lot of kinds of software. And similarly engineering scale, well, this is bread and butter. We're all dealing with a lot of things changing, and we need languages that scale with that.

So who uses Go at Google? Hundreds of projects, thousands of programmers. We have millions of lines of code, and it's growing quite rapidly. I'll go through a few public examples. One is Flywheel, which is a SPDY proxy for Chrome. This is particularly important for mobile devices that are bandwidth constrained, say in developing countries that have limited networking, and so this is doing on-the-fly compression and serving of web pages to mobile devices. Next is dl.google.com; this serves all of your Chrome and Android and Earth downloads. Vitess, this is a big one. This is actually one of our oldest production systems written in Go. It's YouTube's MySQL database balancer, and it has been serving all of YouTube's database traffic since 2011. So I won't just say Go is production ready now; it's been production ready for a while, and we use it in a big way at Google.

There are a couple more on here. Lingo: we moved all of our logs analysis from Sawzall to Go. Sawzall was an older Google-developed language specific to logs analysis, and Go is a general-purpose language that gives people writing logs analysis more power and more ability to do proper unit testing and things like that. So Go succeeded in a completely different domain as well. So while our target here is network servers, Go is a general-purpose language. You can use it for apps, you can use it for tools. Docker is one of our big external adopters, and they're largely building
command-line tools and frameworks like that.

So let's look at some other external adopters. I'm not going to read this list, but just to give you a sense of the flavor here, we've got hundreds if not thousands of external companies using Go. A lot of them are startups. Startups have the distinct advantage of not having a large amount of legacy code to integrate with, so they tend to try new languages quickly, but we are seeing adoption in enterprises as well. Again, if you're developing tools or servers, standalone binaries that can communicate with the rest of your system, that's a good opportunity to try a new language, and so we've seen a number of companies trying it out.

And our community is quite enthusiastic. This bus at the bottom is from CoreOS, and that giant figure on its side is the Go gopher, which is the language's mascot. So they painted a bus with our gopher; they liked it that much. The GopherCon conference in Denver has sold out the last few years, with twenty-five hundred attendees per year, and now there are, I think, a dozen such GopherCons around the world. It's a lot of fun.

So let's talk about how Go is similar to and different from the languages you know already. This slide is from a talk by Brad Fitzpatrick back in 2014. He titled it "Go: 90% Perfect, 100% of the time." He's on the Go team, and he's tried a variety of languages; he developed LiveJournal and came to Google. These two axes are what are by now the traditional debates between developer productivity and program performance. On the y-axis, as you go up, you get languages that tend to be more fun and tend to support more rapid prototyping. You get the dynamic languages: Ruby, Python, Perl, and JavaScript. High on the x-axis you find C and C++. These are what people still turn to in order to get the most performance out of their programs, to get closest to the metal and control things at the lowest level. Java, you can see, is sort of making a trade-off in this space: it's
fairly efficient, and it provides a lot more safety and developer support than C and C++ do. And JavaScript as well, you can see, is coming across on the efficiency axis, as it has been quite optimized. Our claim is that Go finds a new point in this space: probably comparable in performance to Java, maybe a little better once you start employing some concurrency, but we believe it's a lot higher on that developer productivity axis. Let's talk a little bit about why.

So I'm going to compare Go to Java for a few slides. First of all, Go has a lot in common with Java. If you know Java or other languages like it, Go will be very familiar, so that's going to lower your learning curve. Go is actually pretty easy to learn if you're coming from a C or Java or related background. Go is in the C family. It's an imperative, procedural programming language with curly braces. It's statically typed and compiled. It's a garbage-collected language. Over the last few years we have optimized our garbage collection for low latency, since our focus has been on servers, but we are also now doing more and more work to optimize for throughput workloads as well. Go is memory safe: you can have nil references, but nil references and bounds are checked at runtime. Variables are initialized to zero. This is all bread and butter; you know this coming from these other languages. In terms of programming model, Go gives you methods and interfaces and runtime type information, so you can do type assertions and reflection.

So let's talk about differences. Go differs from Java (this is again comparing to Java, not the other languages) in several ways. On the fast and efficient for computers axis, Go programs compile to machine code. There's no VM, and this means that we can optimize for the architecture. I moved my bullets around here. We also produce statically linked binaries, which means you can compile your program, copy it to the machine you want, and run it
without pushing a bunch of dependencies around. This also means you can cross-compile for the destination architecture. You can build a Windows binary on your Mac, copy it over there, and run it. This makes deployment a breeze. Go also gives you control at the language level over memory layout in your structures and your arrays. This means that your program spends less time chasing pointers. You have the ability to control the allocation patterns in your program and your indirections. This gives you an axis of control that you don't get in some other languages.

In terms of developer productivity, Go really has a simple and concise syntax, and you'll see that in the code examples. It is lighter on the page than a lot of statically compiled languages. It's going to have more of the look and feel of a dynamic language. Again, you get the statically linked binaries, which simplifies deployment. And in terms of language features, you get lexical closures, function values, built-in strings and maps and so on, and that built-in concurrency, which we'll spend some time on later.

Now, I claimed that Go is simpler. It can't be simpler if we just took an existing language like Java and added stuff. That would be Java++. That's Scala, right? We have to leave things out or take things away. We really think that there's a smaller, simpler experience for the user, because fundamentally, if you have to think about all these choices, all these different things that you could be doing, that's paralyzing, right? So by actually constraining the language you get something that allows you to be more productive.

So Go has no classes and no inheritance. That doesn't mean you can't have types with behavior: you can have types that have methods, and interfaces, but it's not quite the same as having classes and inheritance. There are no constructors. There's no final modifier: you have compile-time constants, but no final on your data types. There are no exceptions; I'll show you how we do error
handling in Go. There are no annotations. There's no aspect-oriented programming in that sense. And finally, there are no user-defined generics. This tends to be a big sticking point for people considering Go. They think, how could you possibly have a language in 2016 that doesn't have generics? After all, Java added generics in Java 1.5 back in 2004, so clearly they figured out it must be a good idea. Why haven't you? This has been an active area of debate in the Go community for several years, and within the Go team. What's interesting is that all that time, during all that debate, we've gone ahead and built big production software quite successfully, and the pinch of not having generics has not actually been nearly as significant as you might expect. I'll show you in the examples how we build something interesting. Part of what helps is that we have a few generic data types built into the language, so that takes care of a lot of common cases, and for the rest of it, well, you'll see.

So why did we leave out these features? Why make the language smaller? The answer is that clarity is critical. Go is optimized for the code reader over the code writer. This is because you spend a lot more time maintaining your software than you do producing it. So when reading code, it should be clear what the program will do. It should also be relatively straightforward how to implement certain behaviors in your program. Idioms and conventions are good; they help comprehension. So when you're writing code, it should be obvious what you should do to make the program do what you want. Sometimes this means you're going to write out a loop instead of invoking some obscure library function, and that's okay. The "don't repeat yourself" mantra certainly has value, and there are good reasons to reuse code, obviously, but DRY can be overdone, and it can force you to learn, say, a bunch of helper libraries when really you just wanted
to write three lines inline and move on with your life. So there are some interesting philosophical differences between Go's approach and what's come before. If you want to find out more, there are two great talks by Rob Pike on these: "Less is exponentially more" and "Language Design in the Service of Software Engineering." I strongly recommend those talks.

All right, so let's get to the code. Again, Go looks familiar. Here's hello world. Go programs begin execution in package main, function main. This is all very familiar, right? Package statement at the top, import statement. We import the package fmt; that's our package for formatting strings. In the main function, which takes no arguments, we're calling the Println function from the fmt package, printing out hello world. Go source code is Unicode, so we've got our nice Chinese hello world there. I just clicked the Run button, and that compiled and executed my program on my laptop.

So I said Go is for servers, so here's a server. This is a web server that is going to serve that hello world text on the endpoint /hello. In our main function, the first thing we're going to do is register a handler for the /hello endpoint, calling the HandleFunc function from the http package, and we're going to say that we're going to handle that endpoint using the handleHello function, which is defined here below. The next thing main is going to do is print out the host and port it's serving on, and then we're going to block in ListenAndServe and tell it which host and port to listen on. That function shouldn't return unless something goes wrong. If we're unable to bind the port, it will return, and log.Fatal will log the error and exit so we can analyze it.

For handleHello, the signature is determined by what HandleFunc needs. It takes two parameters: a ResponseWriter, which is the object that's going to let us write the response back to the user, and the http.Request. This star here, this is a
pointer. So Go has pointers. They are safe, and you can think of them like object references in Java. We're going to print out our URL, and then we're going to write the hello world string directly to this http.ResponseWriter.

Now, you may have noticed that in Go the type declarations come after the variable names. If you remember old languages like Pascal, this is familiar. For those of you who grew up on Java and C, this looks a little strange, but you get used to it very quickly, and there are good reasons for it; you'll see that in some of the type declarations later. The other thing you'll notice is the varying capitalization throughout. Go has only two kinds of visibility. We don't have public, private, protected, and so on and so forth. There's just exported and unexported. Exported means things outside your package can see it; unexported means they can't. So it's the equivalent of package-level visibility versus public. Capitalized means exported, and lowercase means not. So you don't need any more keywords; you just name it this way. And this isn't just convention: it's enforced by the compiler, and it works surprisingly well.

So let's run our little server. We're serving on this URL; open that up, and there's our hello world. Great. And you can see here the log message we printed with our URL.

All right, so here's the example we're going to work with through the rest of the talk. This is a fake Google search front end. The handler here is going to call a function that will fake out some Google search results for us and then render those results in a few different formats. When we get to the function that produces the Google search results, we'll talk a lot about concurrency and how we actually fetch results from a distributed set of back ends and deal with them. So this looks just like our hello server, except the endpoint is /search, and our handleSearch function has a contract that it needs a query
parameter q, and optionally you can specify an output format, which is JSON or pretty JSON. So let's run that server. If we just click that first link, we get an error message: we're missing the query parameter. Click that second link, and here we get some rendered search results. We'll see these again a little later. We get a web result for the query golang, which is the Go programming language home page. We get an image result of the Go gopher, and a video result for another great talk, "Concurrency is not Parallelism," also by Rob Pike. And you get a little text at the end saying we have three results in two hundred some-odd milliseconds. That's a fake time; if I reload a few times, the time will change, and you'll see how that's happening later.

So I'm going to walk through some Go code and Go syntax here, to help you get familiar with it. The first thing we do is log our URL and then check for that q query parameter. Here we're calling a method on the request called FormValue. We're asking for the query parameter q and saving it in this local variable query. This colon-equals syntax is declaring and initializing a variable in one step, so we don't have to state the type here. The type of query is determined by the static type of the right-hand side. The right-hand side here is this method call, and you can see it declared below. Here in package http we have a type Request, and we have a method, which is a function on Request (this is the receiver), called FormValue, that takes a string key and returns a string. This means that query is going to be a string. So this is the tiniest bit of type inference. There's no Hindley-Milner type inference here; this is just local, saving you from having to declare the type when you're declaring local variables. It turns out to save you a lot of typing.

If the query is empty, then we're going to render this missing q parameter error to the user, just calling this helper function to write that response to the user
with the HTTP status code. Great, this is all straightforward.

So now we're going to fetch the search results. Here we're going to use a package I'll show you later; the package is called google. What you're seeing here is an import path for a package in Go. The first part of the path, golang.org/x/talks, is the identifier for the repository, and how long that is depends on the repository. This roughly maps to a GitHub repository. The next part is just a path within the repo, and finally you get the package name. We have tools that can automate fetching these packages to your local workspace, so given a Go program we can automatically fetch all of your dependencies and install them for you.

Returning to our handler, we're going to run the Google search, and we're also going to time how long it takes. First we're going to run google.Search on the query, and it's going to give us two return values. In Go we can have multiple return values, and a very common kind of last return value is an error. So here we get the results and we have an error. Results is a slice (a slice is like a growable array or a vector), a slice of search results, and I'll show you that type later. The error is an interface value. Interfaces in Go are interesting. All an interface specifies is a set of methods that a type must implement. So it's really just a constraint on what you can do with that type, but since you have runtime type information, you can then do type assertions to find out more about the underlying type and the value. Right now, since we've just got an error out of this call, all we can do is call this err.Error method to get a human-readable string, and that's all we're doing here. If we get an error, if the error value is not nil, we're just going to print that string to the user and report an internal server error. But say we wanted to learn more. What else can I learn from this error in terms of
structured information? Was it a timeout or something else? You'd use a type assertion to find out more about that, either using other interfaces or, if you know the concrete type you're looking for, using that.

So now that we have the search results, we're going to put them in a little structure so that we can integrate with the rendering a bit more easily. We're going to declare a little inline type; this is a type declared locally inside our function body. It's just a struct called response that has two fields: the slice of results and the elapsed time, which is a duration. Then we're going to declare a value called resp, which is that response, and here we're just specifying a little struct literal with the results and the elapsed time we got from our call. This composite literal syntax is very, very handy. You may remember it from C. You can specify the struct fields positionally, or you can name the separate fields, like Results:, and the rest will be zero. The same syntax works for maps and slices. This is again something you would see in a dynamic language, right? You're used to seeing this in JavaScript, but it turns out to be just as useful in statically compiled languages, and it's particularly useful when you're building things like protocol buffers, messages to send over the wire. You can specify the message as a struct literal and really see what the message is, rather than constructing it with call after call. So this ability to lay out a value structurally turns out to make your program quite a bit more comprehensible.

Finally, we're going to render our search results. Remember we had this output query parameter. We're going to request that with this req.FormValue call, and we're going to switch on that value. You remember that FormValue gives us a string, so we're switching here on the string, and each case is a string. We've got a case for JSON, a case for pretty JSON, and a default. In the
JSON case we're going to use the encoder from the json package and encode the response. Now, a response is what? It's a struct with two things: a slice of results and an elapsed time. So let's go back to our server; I think I have it on the next slide. There we go. Let's run our server, open up our URL, and see our JSON result. Here are the same results we saw for the web, but in JSON. It's kind of hard to read, it's all packed together, but you can see there's a results field and there's a title and so on. So let's render our pretty JSON instead; that was just another case there. All right, there we go. This makes more sense, if that'll ever disappear. I don't know why the title bar doesn't want to disappear. But we get our slice of results, the title, the URL, and so on. What the Go json package did is take that Go structure, that Go value, and simply map it with reflection over to a JSON value. And of course you can do the reverse on the decoding side. So you now have a bridge from your statically typed space in Go to your dynamic space, your wire space, in JSON, and it makes that integration very, very smooth.

So I showed you we have regular JSON, we have an indented form, which takes a little bit more information, and then we have this response template to render our HTML. Go has its own templating package, and there are third-party Go packages for standard ones like Django and so on. Each of these cases takes the response and does two things: it sets this error if there's an error, which we check later, and on success it writes to that http.ResponseWriter. Now, the json package and the template package don't know anything about the http package. This interaction is mediated by an interface. This is again nothing new; in Java it might be OutputStream, right? In Go this is an interface called io.Writer. Again, an interface is just a set of methods, and io.Writer contains a single method, Write, which takes a slice
of bytes and returns the number of bytes written and an error. This http.ResponseWriter type implements that interface, and that's how this all gets joined together.

All right, finally, to actually render those results. This should be pretty familiar to those of you who've used HTML templates. We can range over the results and produce an HTML list item for each one with the title and URL, and then print out the number of results and the elapsed time. You can see a rendering below.

All right, that's it for the search handler. That was frankly just to get you introduced to Go syntax; next I'll go on to the fun stuff. I just want to say that everything I've used so far is packages from the standard library. The standard library is very good. We build Google production software on top of it, and we make it available free to the world, so use it; it's excellent. The other thing to mention is that the search handler was just straight-line code, top to bottom, in everything it did. There were no callbacks, there were no futures. And this is where you can really get into the power of Go. Go gives you very, very lightweight threading and a very efficient runtime for swapping out and scheduling those lightweight threads. We call these threads goroutines. Go servers can scale very well. We run each request in its own goroutine, so you can have hundreds of thousands of these things running, and when they block they don't use an OS thread, so you're scaling very well.

So let's talk in more depth about how Go's concurrency works. Go's concurrency is inspired by CSP, communicating sequential processes, introduced by Tony Hoare in 1978. CSP states that concurrent programs are structured as independent processes that execute sequentially and communicate by passing messages. Sequential execution is fantastic. I know threads get a bad rap, but when you have them and they're light and they work, there is nothing better than having a stack trace to understand what your program is doing and being
able to read code straight down and have that work. Async callback spaghetti is not your friend. It kills your diagnostics and it's very hard to understand. You're essentially taking everything your stack was giving you, wrapping it up into objects, and passing them around. So Go brings back the sanity, and we'll go into a deep example to show you that.

Go's concurrency is built on three primitives: goroutines, channels, and the select statement. Goroutines are lightweight threads managed by the Go runtime. Go has an M:N threading model, which means you can have hundreds of thousands, if not millions, of goroutines running on a very modest number of OS threads. You just tell your Go program how many OS threads to run, and it schedules the goroutines onto those, and a blocked goroutine doesn't consume a thread. Goroutines start with very tiny stacks; I think the default right now is a two-kilobyte stack, and they grow dynamically as needed. So you don't worry about sizing your stacks, and you don't worry about hitting your limit. The syntax for starting a goroutine is very simple. Instead of saying f(args) to run the function f, you say go f(args), and that starts it running in a new goroutine, and the code that executed that go statement just continues executing immediately.

Now, it's not enough to have these lightweight threads; they need to be able to synchronize and coordinate, and this is where channels come in. Remember from CSP: independently executing processes that communicate by passing messages. Go channels provide that primitive. You can think of them like typed, synchronized queues. So if I make a channel of string, that means that I can pass strings around on that channel. From one goroutine I can send a string using this arrow operator, so here I'm sending the string hello on the channel named c. Channels are also just values in Go. You can even pass channels around, and that's useful for certain design patterns. From another
goroutine, I can receive from c. Here the arrow's on the other side, and I'll receive that string s and I can print it out; that's that same "hello" string. So you can pass around strings, you can pass around maps, you can pass around pointers; you can pass around the memory you're sharing. This gives you a way to safely share memory, because you're essentially transferring ownership between distinct threads of execution. Channel communication is synchronous: a send operation will block until another goroutine is ready to receive, and a receive operation will block until a send can happen. So it's also a synchronization point; it's a very powerful primitive. You can loosen that synchronization by adding a buffer to the channel, and I'll show you an example of that later.

Finally, there's the select statement. The select statement is a way for goroutines to block on multiple communication events. It's a lot like the select call on Unix, and it looks like a switch statement. Here we have two cases. In the first case we're waiting to receive a value from the channel called in; in the second case we're waiting to send a value v on a channel called out. This statement will block until one of the two cases can proceed. If the in case is selected, then that value will be stored in this local variable x and we'll print it out. If the out case happens, we'll print out the fact that we sent that value. Only the selected case runs; the other one is guaranteed not to run. If they both happen to be ready, then we choose a case at random.

All right, so those are our primitives. Let's see how we put them together to solve an interesting problem. You remember that in our search handler we called that Google search function: we handed it a query and it gave us some results. But that seems kind of simple. What does Google really do? Well, given a query, we return a page of search results and some ads, but we get those
search results by querying a bunch of different repositories: web search, image search, YouTube, Maps, News, and so on. Then we mix them and order them to produce what we think is the best result, customized for the user. So how would we actually implement this in our little server, if we wanted to query these backends and assemble those results? We're not going to build Google here, but I'm going to fake it out for you.

We'll start with a fake framework where we simulate three backends, each with a random delay from 0 to 100 milliseconds. We've got three declared here, Web, Image, and Video, and each one we're constructing with this fakeSearch function. fakeSearch takes three parameters: the kind of search (web, image, or video) and a fake result, which is a title and a URL. And here's the fakeSearch function: it takes the three parameters kind, title, and URL, all strings, and it returns a SearchFunc. SearchFunc is just a type I declared here; it is itself a function from queries to results. The way we specify it here is as a function inline, with a literal, and this is a closure: it's closing over these kind, title, and URL arguments. What this function is going to do is sleep for a random duration of up to 100 milliseconds and then return a result. The title is just going to tell us the kind and the query, and the URL is the second element of that result. So that's our fake.

Let's test that out real quick. Here's just a little test driver program. We're not going to run the server; it's just going to time how long the search takes and print out the results, the elapsed time, and the error. So we run this, and we get our web, image, and video results in 256 milliseconds, and nil, which means no error. If we run that repeatedly, we get the same results in a slightly different time, and if we run this a few times we'll notice that the total elapsed time hovers around 150 milliseconds. Why is that? Well, we're taking
three random samples from 0 to 100 milliseconds and we're adding them up, and the reason we're adding them up is that our implementation is serial. Our Search function takes the query and returns a slice of results. Here's our implementation: we declare our results slice here, again using a literal. Web is going to execute first, then Image, then Video, and then we return our results. So when we run that, we are adding up three things that are each going to average around 50 milliseconds, and so we expect anywhere from 0 to 300 milliseconds, averaging around 150. That's our baseline.

This is Go, though, so we can parallelize this. Let's see how we do that. We want to run the Web, Image, and Video queries concurrently, so we're going to run each one in its own goroutine and collect the results on a channel. We declare this channel c as a channel of Result, and we start three goroutines: go, go, go. Each goroutine is going to run one of those searches and send the result on the channel c. Now, you'll notice I said earlier that you just say go and then a function call. Here I'm declaring the function inline, as a closure with no arguments, and I'm calling it with no arguments. Closures turn out to be a fantastic match for Go's concurrency model, because it means you can bind a synchronous call like this Web search call to a channel in just a simple one-line expression. It really couples quite well, and it means that the signature of Web is just that of a simple synchronous function: it just blocks and returns a result, and here we're executing it in a concurrent context. When we return, we return a slice of three results, and those results are three receives from the channel: we receive one, then two, then three.

So let's run that and see what we get. All right, first thing: here are our three results, and the order changed. We got image, then web, then video. Well, what order is this? This is ordered fastest first. Image took the
shortest amount of time, so we received it first, then web, then video, and it turned out none of them took longer than 15 milliseconds, so we got lucky there. If we run this a few more times (there's 45 milliseconds), you can see the order is shuffling each time, and you can see we're staying at or below 100 milliseconds. The reason is that the slowest one is only ever going to take 100 milliseconds, and that's as long as we're going to take, because we're running them in parallel. Let's look at the last slide again. Here's the last slide: Web, Image, Video. Here's the parallel slide: Web, Image, Video. It's barely longer, if longer at all, and I could have cheated and made some of these one-liners. So Go's syntax for doing this sort of concurrency is very lightweight. This is very powerful.

But we can do better. At Google we care a lot about speed, so let's say we've decided that a hundred milliseconds is just too long: people are noticing, and we've got to go faster. So we're going to have a new version of this search that takes a timeout. The timeout is another parameter, a time.Duration, and we're saying that we don't want to wait any longer than 80 milliseconds. We're going to call SearchTimeout with an 80 millisecond timeout, and we want this function to return at 80 milliseconds, so we can't wait for the slowest guys if they do take longer. Most of the code is the same. We start a timer right at the beginning of the function, with this time.After function. It returns a channel, and we save it in this variable timer; the time package is going to send a value on that channel after the timeout elapses. Now why do we do it this way? Well, we need to receive the results from this channel c, but while we're waiting for the results we also need to be checking that the timeout hasn't expired yet. We could just try polling the timer in between receiving results, but then we might exceed our deadline if one of them is really slow.
We need to wait for both simultaneously, and here's where select comes into play. I've rewritten this as a loop. Each time through the loop we select: either we receive a result from the channel, in which case we append it to our slice of results, or the timer fired, in which case we return an error, timed out. In this case I'm also returning the results, just so you can see the subset of results received; typically, though, you wouldn't bother returning anything here, because the call failed. If we get through this loop, we've received all three of our results, we're complete, and we return the results and no error.

So let's see how that works. Okay, here it took 82 milliseconds. We got image and web, we did not get video, and we got this error message, timed out. Let's run that a few more times: 62 milliseconds, all three results; 85 milliseconds, timed out; 85 milliseconds, this one timed out. So we're timing out quite frequently here. When we take those three random samples, it's still fairly likely that one of them is over that 80. So this is a problem; our users are complaining. How do we avoid timeouts? How do we avoid discarding results from slow servers? Well, what we do is throw money at the problem. Let's replicate our search backends: we'll send our query to multiple web backends, multiple image backends, and multiple video backends, and take the first response from each of those sets of replicas.

So let's define a little helper function; I called it First. First takes any number of replicas. This dot-dot-dot syntax is Go's syntax for a variadic number of arguments, and it's just available to the function as a slice: you can index it and range over it the same way you would any other slice. So First is going to take a list of search functions and return a search function. The function it returns is going to create a channel of results, and it's going to create a little closure here to run the ith replica and send its result on
that channel. It's going to start a goroutine for each replica, and then it's going to return the first thing it gets from that channel. Done. But there is a subtlety here. When I return, those other goroutines are still running; returning doesn't make them stop or anything, they still go. So how do we deal with the fact that they still want to send a result on the channel, but we've walked away? Here's where that channel buffer comes into play. What I've specified here is that this channel can absorb up to the number of replicas' writes, that is, sends on the channel, without blocking. This means that our other goroutines are going to keep running, they'll send on that channel, the channel will absorb the writes, and then everything will exit and clean up.

So let's see an example of how to use this. I'm going to call First with two fake searches, replica 1 and replica 2. Here we got replica 1 in 54 milliseconds; there's replica 2 in 37. If I run this a few times we should see it roughly balanced. We should also see that the average is a little below that 50 milliseconds, because we're taking the smaller of two random samples.

Okay, so that's our First function. Let's put it together with our Search function. Here's SearchReplicated: we declare a replicatedWeb to be the First of Web1 and Web2, the same with Image, the same with Video, and then we just replace Web with replicatedWeb here, and likewise Image and Video. So we run that, and here we got all three results: the web 1 result, the image 1 result, and the video 2 result, in 64 milliseconds. And if we run this a few times, we can still get a timeout. In this case web timed out, and this means that both Web1 and Web2 took longer than 80 milliseconds. So it's still a possibility, and we can make other trade-offs to try to speed that up, but we've greatly reduced the frequency with which timeouts are happening. We've dealt with our tail
latency. Another thing I want you to notice is the signature of SearchReplicated here. This is synchronous: it's just a function call. It takes arguments, it blocks, it returns the results. The signature of replicatedWeb takes an argument and returns a result. There's nothing about channels or callbacks or futures in the signatures here. The code and the interfaces are simple; all of the use of concurrency is encapsulated within these functions. This is one of the critical powers of Go: you can have simple code that does very sophisticated concurrency.

So what just happened? We took a slow, sequential, failure-sensitive implementation of search and, with fairly straightforward transformations, made it fast, parallel, concurrent, replicated, and robust. No locks, no condition variables, no futures, and no callbacks. Now obviously we're using channels, and we're using select and go, but that's built into the language for exactly this reason. We believe these are the concurrency primitives that let you compose programs that do real work, really well.

All right, so hopefully I've whetted your appetite for Go. Hopefully I've gotten you interested in the language and you want to do more. Where do you get started? Start with the Go Tour online, tour.golang.org. It is an online tour, in your browser, that walks you through the language from the basic syntax through to the concurrency. Then, if you want to learn more, go to the Learn wiki page; it has hundreds of resources that will take you as deep as you want to go.

Now, if you're still interested, I want to talk to you about an important technique that we've discovered while trying to get a programming language adopted, both within Google and outside, and it's running a pilot project. As manager of the team, and as I've worked over the last several years to get teams within Google and outside Google to adopt Go, I've had to understand what problems they face with this. As I said, Google is an enterprise; we have a huge legacy codebase. It's
not just... you know, adopting any new technology, language or not, is expensive and difficult, and people are skeptical. You want to moderate the costs and the risks of switching to any new technology, and this is where pilot projects come in. I realize many of you probably already know this, but let's just go over it. You want to help your organization discover the benefits of your technology while moderating the risks.

So, step one: choose something small to write in Go. A service, a server or microservice, a tool, a command-line tool. It can be either a rewrite or something new, but something moderate that you can do with a partner. Find a friend and build a prototype. You want to use this experience to find out: are the libraries you need available? How do you integrate with the editors and IDEs your organization is using? How do you integrate with your build, test, and deploy cycle? You want to spend some time with your program: how are you debugging it, how are you profiling and optimizing it? And then compare Go, with a fair amount of skepticism, to what you're using today. Try your best to isolate just the language changing, keeping everything else equal to the extent possible. Make this an A/B change. I realize it's not 100% possible, but that's the goal. And then present the results to your team and organization and discuss. We believe this is probably the fairest way to evaluate whether this is really buying you anything. I hope in my presentation I've shown you that it's a powerful language. I think if you think through the example I just showed you and the languages you use today, you'll see that it's quite a bit harder in most languages to get that sort of behavior. We found this to be very powerful at Google.

Final mention, since I've got a little bit of time: Go is designed for tooling. This was in the language design from the outset. It's a simple syntax that's very easy to parse and very easy to manipulate mechanically. One thing about Go is that the formatting is
standardized by a tool called gofmt. Everyone uses gofmt, and this means that machine-generated code and human-written code are indistinguishable. This is hugely valuable, because you can automate a lot of the maintenance of your codebase. Editors and IDEs: there's existing integration for all the big ones. Eclipse, IntelliJ, Visual Studio, Sublime, Emacs, Vim; they're all in wide use.

Let me show you the Go Playground real fast, play.golang.org. This is an online playground for Go. It'll compile and execute your code, and this turns out to be really useful for reporting bugs, for exchanging ideas, for getting to know the language, and it lets me show you a couple more tools. So, I mentioned gofmt. If I have badly formatted code, I can just run Format and it'll do the right thing, and this is integrated with all your editors and IDEs already. And if I want to use log, say, I can run Format again. What I did right there is run goimports: it recognized that I'm using the log package instead of the fmt package, and it removed the fmt import and added the log import. So this is another way that Go integrates with your workspace really nicely. You just write your code, you save your buffer, it gets formatted and the imports get updated, and you move on with your life. It's fantastic. If you want to share the snippet, you click Share and you get a URL to share with your friends. So this turns out to be a nice little helper when you're getting to know the language and when you're reporting issues.

There's, of course, automatic completion. We have the go tool for automatically fetching and building your code; it helps you manage your workspace and deal with remote packages. And then we have a cool tool called guru, which does all of our static analysis: it gives you bug finding, it gives you code navigation, and it has various editor integrations as well. Finally, there's godoc.org. This is an index of all the generated source, sorry, generated
documentation for open-source Go packages on GitHub and other repositories. So, you know, we could search for RPC and (my internet connection's up, great) we can find a bunch of RPC packages. Here's gRPC, which is Google's RPC package, based on our internal Stubby implementation. It's available in 10 languages, and it's how we are talking to Google Cloud services. We have a really nice API in Go. So this is familiar from Javadoc or Pydoc, but it's available as well in Go, and it's a great way to find the libraries you need.

All right, so again: take the tour online, check out the Learn wiki page, and if you are into contributing to open-source communities, you are most welcome. We have a great community for Go. Check out golang.org/project and get started there. Thank you very much.

So at this point I'm seeing little gophers flying on rainbows, and I'm ready, I'm drinking the Kool-Aid. But I wonder: what clues would I look for to know that Go is not the right tool for the job?

Excellent, excellent question. Wow. You know, the guy who's going to present on Rust tomorrow is probably a better one to ask. No, so, what we really run into is fundamentally teams that run into friction. I actually spend a lot of time thinking about this internally at Google, because one of the things I have to do in prioritizing what the Go team does is go to our users and say: what are your pain points? What's keeping you from being successful with Go? Part of it, internally, has been a lot of work on library completeness and efficiency. I think Go has an excellent standard library that's very complete, and obviously we can build our software on it. Internally, people are pushing on Go very hard in very high-performance, high-throughput applications, and we have a lot of optimization work to do. So there's a question of how hard you are going to push, and whether what's available publicly meets your needs, your performance and scaling cost needs. So that's one thing
where a prototype always helps: it gives you something to measure. And also just finding the libraries you need. You want to make sure, if you need to integrate with particular protocols or particular templating systems, that you find the touch points of your system. Your operations team will have a strong opinion here. They are going to want to see a particular production contact surface for your program, and you want to make sure that you can provide that. A lot of our initial work on Go inside Google was providing a production contact surface that looked just like the ones that C++ and Java had, so that our SRE teams became very comfortable with it. That's the sort of thing that I think a pilot project can really help you explore, and that's one of the main things, I think. I showed you on the slide earlier that Go, we think, presses up on that developer productivity axis, and we spend a lot of our time on the team pushing it sideways on that efficiency axis. But we're still not going to claim we're as efficient as C++ or C yet. We want to close that gap. So that's another thing you need to ask yourself: if you're currently using C or C++ because you need every inch of performance, well, you'll have to measure whether Go meets your needs. But if you're doing anything less, I think Go actually can meet most performance needs, and it does particularly well in distributed systems, where you really are managing a lot of I/O.

Yes? Hi Sameer, a quick question for you. I know you mentioned that there's no one IDE for Go, and I've been using GoSublime for a long time and I'm quite happy with that. But from my standpoint, all the languages I've used, C++, Java, and arguably Python, have gone through a period of time where one or two IDEs dominated, and I feel like that helped them spread out and made them more popular. I was wondering if your team is actively working
with anyone on this one IDE?

We're certainly collaborating with various external companies. The IntelliJ company, I think, we talked with when they started doing their integration. I can't talk too specifically about what the Go team is working on. I can say that our general philosophy is to provide tools that can integrate into other development environments, and to encourage them there. So that doesn't really answer your question.

Yes? I'm just curious: the CSP paper initially appeared in 1978. Was there a key technical innovation that occurred recently that allowed this to become a more mainstream construct, or is it kind of a case of unearthing a hidden gem that was lost to time?

You're exactly right, these ideas are very old. They have appeared in Erlang and the actor model; Scala's actor model and Dart's, I think, are all reflections of the same CSP model. Go is distinguished a bit by some of Rob Pike's earlier work with channels as first-class values. In, say, Erlang, you're going to talk to other actors based on, I think, an address for their mailbox, whereas in Go you're going to pass these channel values around, and that's your synchronization point. It's a subtle distinction, but it allows for different idioms and design patterns. I think these languages have been around a long time, and so the ideas in CSP are very old, but the need for them has become more keen lately. If you saw the architecture talk this morning, Muhammad's talk, he talked a lot about the need to take advantage of multicores, and I feel like everyday developers are feeling that pinch, the need to really take advantage of multicores. But we don't want to convert all of our algorithms to parallel algorithms or to vectorized stuff. What I showed you here is that we could take something where we just have synchronous functions, we just want to run them in parallel, sort of spread them out and then gather the results, and that, I think, is a
lot easier to wrap your head around. And so I think that's one thing that distinguishes this approach from what you've seen before. In many ways that bit of syntax is sort of incremental over what you've seen before, but it's the whole runtime underneath it that makes it efficient; that's the big jump. But Haskell, Erlang, Scala, they all have various kinds of concurrency built in as well, so there's a long lineage. Robert Griesemer had a great talk at last year's GopherCon on Go's lineage, and if you're interested in the history there, that's a great one to check out.

I would have expected better performance on your first curve. Why is it similar to Java, given that it's lighter weight, and has pointers, and is native code?

It's a good question. Java is about, what, 20 years old. It has had a lot of work on the JIT compiler and garbage collector that I think has made it... I think otherwise Java probably would have been over here. So I think Java has pushed itself rightward quite a bit over 20 years, and Go, I think, is starting comparable to Java, but we can push rightward from here. Go's only six years old at this point.

Hi, and I've just got to say, I love your quote about how code should be written to be read. Mine was about the pointers: could you talk a little bit more about why, in Go, that was something you decided on? Just a little bit more information about that; you kind of talked about it.

So yes, I actually thought I was going to be very tight on time with this talk, so I was going fast. Pointers fundamentally allow you to distinguish between inline values and indirection. In Go you can actually lay out a flat struct, a nested structure, that is entirely linear in memory, but a pointer is literally one indirection. So if you want control over that, if you want control of memory locality, Go gives that to you. So that's the
critical difference: if you want a reference, that's where you pull in a pointer, but if you want to just lay out flat memory, so you have that efficiency of access, you just lay out a struct or an array or whatever.

So I've had a chance to play around with Go, and I've been following along with it for quite a few years, finally at least getting the proverbial feet wet, as it were, between picking up on some projects and also developing on my own. One of the things I have noticed is more of a trend, a transition away from using go as a dedicated compiler to more of a full-featured build suite, as it were, where you're using it for retrieving packages. Or at least the go tool, yeah. Right, okay. So one of the things I had noticed, especially fussing around with Kubernetes, is that the dependency mechanism for trying to automatically bring in the packages you need, without having to check them in as well, still seems to be hit and miss. I think it's probably too simplistic for most real-world use cases. It's great for getting started and rapid prototyping, but...

I was just talking to someone earlier today, and the name escapes me at the moment, about how package management and dependency management is certainly an area that we need to actively develop. There's been some work in the last two releases to provide better support in the go tool for vendoring, but I think there's a lot in dependency management and package management that needs improvement. I think this is just a maturity issue. I think it's also partly a reflection that the Go team works inside of Google, which simply doesn't have this problem, and so we are not the ones best suited to design a solution.

Yeah, right. Trust me, the go get aspect has come in real handy, but being able to at least record that kind of list, using something like godep, which may or may not look into the vendor library, or at least refer to where you're
supposed to point to...

I think there's room for organizing the community effort around package and dependency management, to help standardize and get something decided, because I think that will allow everyone to move forward. It's something we need to do. Okay, thank you.

Hi, thanks for the talk. So the Go channels seem a bit like green threads... the goroutines? Yes, yeah, very much. Do you have to worry about managing, like, goroutine pools for efficiency, or not really?

You don't have to manage explicit pools unless... so, what was the talk we saw earlier, the Hootsuite talk, that was talking about managing how much memory you want to use and how much concurrency. You can certainly create a pool of goroutines: you can start, say, n goroutines and have them all receiving from the same channel. Channels are synchronized; they're safe to be received from by multiple goroutines. So, for example, say you are reading files off a file system and you want to process files in parallel. Say you want to run the MD5 sum on every file in a directory. If you just iterate over your file system and start a goroutine for each file, you could easily pull in more memory than you have in your machine by reading the files into memory, so you may want to bound that. Now, you can bound that with a semaphore, or you could just bound the number of goroutines that are doing the reading: you start up a few goroutines and you feed them files over a channel. So Go makes it very easy to construct this sort of restriction. It's not about controlling the pool or controlling the runtime, fundamentally; it's more about just controlling how the data is flowing through your system. Resource management is a tricky problem. I think in many cases what you really want is a semaphore that models your resources, and you use that rather than a pool, because creating and killing goroutines is lightweight enough that it's fine to just do it all the time. Yeah.
It's very light, yeah. I haven't noticed that being a particular performance problem, because we start a new goroutine on every RPC. Every RPC handler on the server, we start a new goroutine, and we're serving thousands of QPS, so that's fine. Yeah. Thank you all very much. Oh, one more? Yes, sir.

Hopefully I'll make this quick. So I actually have done a little Go programming and had kind of moved away from the language for a little while, and one of the reasons was basically that... and I actually noticed it was missing from the list, and this could just be that I don't know enough about the language, but you'd said something about diagnostics. Considering that Go changes the runtime model, I thought it was conspicuous that a debugger wasn't on the list. Is that something...?

Yeah, so there is a Go debugger called Delve that's been developed by an external party, and it's been maturing nicely. There are also a number of built-in diagnostics in the runtime. There's a CPU profiler, there's a heap profiler, we have nascent support for heap dumps and heap inspection that we're working on, and then there's this really cool thing called the execution tracer. If you're familiar with, I think it's Haskell's ThreadScope, where it shows you how the Haskell sparks are executing on the CPUs, this is something very similar for Go, where you can actually see a trace of which CPUs your goroutines executed on, which functions are executing, how they were blocked on channel operations or mutexes, and so on. So we have diagnostics both at the gross level, you know, where's my memory being allocated from and where's my program spending its time, and this, which is very fine-grained.

Yeah, thank you. Sure.

All right, thank you very much, Sameer. There's a 20-minute break now; coffee will be available in the room across the way. The last session of the day is a joint session, which will be upstairs, so a lot
Brendan Gregg will be talking about performance analysis. Hope to see you up there.
Info
Channel: Association for Computing Machinery (ACM)
Views: 63,387
Rating: 4.9629202 out of 5
Id: 5bYO60-qYOI
Length: 62min 6sec (3726 seconds)
Published: Mon Jun 20 2016