DC_THURS on PyTorch Lightning and Grid.ai w/ Will Falcon

Pete: I'm Pete Soderling. I'm the founder of Data Council and the Data Community Fund, and we've been pleased to be able to bring you these programs almost every week. We typically have a great open source leader in the data community and/or the founder of a tech startup in the data ecosystem, and we're super excited today to have another great guest on the show. It's Will Falcon, who is the co-founder and CEO of Grid, which is a new company that's popped up in the deep learning ecosystem. Will is known, probably most recently, for being the core author of PyTorch Lightning, which is an open source deep learning framework. And Will, you have a pretty interesting past, so I'm just going to list a few things, and then you can tell us a little bit about how those things do or don't fit into data. You were previously a researcher at Facebook AI Research and a PhD student under Yann LeCun at NYU. But before that you worked at a bunch of different startups: you co-founded a startup called NextGenVest, you were a software engineer at Goldman Sachs, and in a previous life before that you were a captain in the Air Force and you also trained as a Navy SEAL. So let's start with the Navy SEAL stuff first, because that's most interesting and probably the biggest outlier. How did that happen, and how did you get from being a Navy SEAL to ultimately ending up at Facebook AI Research?

Will: Yeah, all great questions. First of all, thanks for having me, Pete. Always excited to chat with you. I think from a really young age, growing up in Latin America, there's a lot of stuff that happens that you're exposed to, in Venezuela specifically, and I was always passionate about trying to figure out how to make a difference there. When I was in high school, the thing that was top of mind for me, or at least a thing that was accessible to me to be able to make a difference, was the military, and
that was at a time when Iraq and Afghanistan, all these things, were happening, and I thought that was the most relevant way to get involved and have a positive impact. Whatever your political views are, as a high schooler with so many limited options, these are some of the things that we think we can do really well. I thought that the SEAL teams were at the right place at the right time. They were helping in key areas, especially in Latin America as well, so I was really drawn to it. So I joined the Navy and I started going through SEAL training. SEAL training is a very long process, so I was there for a while, and I unfortunately got injured a few times. This happened to coincide with the time in the military when they were downsizing, pulling out of Iraq and all these different things. So basically, administratively, if you hadn't finished training, which was me, and you were still around (it's many years of training), then they were like, "Okay, do you want to be an intel officer, or do you want to fly planes, or do you want to get out?" And I was like, "Okay, well, I came here to be a SEAL, so I guess if that's not going to happen then I'm going to get out and do something else." I was there for about four years active and then two years reserve, and learned a lot. I spent some time with one of the SEAL teams, learning what they do downrange and working alongside them in an intel capacity. It just changes your perspective a lot on what's possible and what isn't. And stress, I guess, levels out; it sets a really high bar for that.

Pete: Maybe some unexpected but interesting lessons toward becoming a startup founder later on?

Will: Yeah, I mean, there are a lot of parallels. I think the biggest thing that you learn in SEAL training is to just roll
with the punches. You show up every single day and you don't know what you're going to do. I mean, you kind of know generally, but you don't really know. You're not allowed to have watches, you don't have any concept of time or anything, so you don't control your day; they control your day. So you just have to show up. Imagine when you go work out today, you're ready for a run and you're like, "Okay, I'm going to do a three-mile run," and you know about how that's going to be. Imagine if for six months, every time you showed up somewhere, you didn't know what you were going to do. Literally, it could be that you just start running and you're like, "Are we doing a two-mile or ten-mile run?" You don't know until you're done. And are we sprinting or not? It doesn't matter. So what that teaches you is literally to just roll with the punches at every single moment. It does not matter what's happening; you just go and do it. And that is the key to the startup world: it's the ability to pivot or switch on a dime, or be okay with complete uncertainty. That's one aspect of it, and to do it everywhere. They simulate combat, they simulate a lot of stuff like that as well, and then when you're actually working with the teams, these things happen very dynamically. These are very dynamic environments. So yeah, a lot of lessons.

Pete: That's really incredible insight, and I can see how that can be applied to some of what you're doing today. So were you technically inclined at the time? When did engineering and CS and stuff come into your career?

Will: So I wasn't. I had never coded. I guess I knew you could code, I thought it was a thing, but I didn't know who did that. I definitely didn't. And I didn't really have a lot of math background either. I think I had
just the standard high school algebra. In high school I was training to be a SEAL, so I wasn't focused on academia or anything. So not really. What happened is I was planning on doing 20 years in the SEAL teams, and when I left I was like, "Okay, well, what am I going to do now?" That's kind of out the window. And I got into finance. I don't know why, but from a very young age I really liked trading stocks, so I had a weird amount of knowledge on this, and I ended up working in finance. During that process I felt like I was doing repetitive work over and over again, so I started trying to figure out how to automate this. I think VBA was the first thing I tried on that side, and then I started looking at Java and then Objective-C, and that led me to iPhone apps. I was like, "Oh, this is cool," and then I just really got into it. That's what I was doing all nights and weekends, and I just kind of picked it up that way. Then once I got to Columbia for my undergrad, that's when I really focused on it.

Pete: Got it, got it. And then was one of your early jobs out of Columbia when you went to Goldman and ran the software engineering team there?

Will: Yeah. That's a very interesting story. I was building apps, I was consulting, and I was trying to launch startups. I think we had this chat a year or two ago. We were on this panel and they were like, "Oh, NextGenVest worked out," and I don't think I'd started Grid yet. And I was like, "Yeah, but there are like 12 other things that failed, that never made it to be anything." How many dead web apps were there? How many iPhone apps that I tried to
launch? So that's what I was doing during that period: building things, launching them, seeing what happened, and learning a ton. One of those apps I launched actually took off. It was something to unfriend people on Facebook using a Tinder swipe, so you load this app and then you could just swipe left and right. That took off; that was an interesting app. From there I started getting recruited by Facebook, and then friends at Goldman saw it, and I guess it was good timing, and I ended up taking a job at Goldman doing that. I wasn't really looking for a job, because I was still in school, so I took the job because it was like, "Hey, come work at a startup." I know what a startup is now, but it's as close as you're going to get in finance. It's this team called Marquee, which is really cool if you go into Goldman. They build a lot of new products, and then they launch them, spin them off, and do their own things. So I got there and it was startup people who were trying to build, trying to move at that pace inside a bank, which is hard to do, but they were doing their best. I was still in school, though, but I was doing it part-time because I was still studying machine learning. So technically I never left during that period, but figured it out, I guess.

Pete: Wild. Such a colorful past. So you're building iPhone apps, you started to explore real career jobs in finance, and Facebook was sort of interested also. How did you get pushed deeper into the data science, machine learning, and ultimately deep learning track? When did that start to happen?

Will: It was around Goldman. When I started at Columbia I started in CS, but after my first semester
I switched to stats and CS, and then later on math as well, mostly because I was more interested in the math. So I was already doing a lot of statistics, machine learning, that kind of stuff before I got to Goldman, but I wasn't doing it professionally. Once I got there and started thinking about it, our teams did a lot of algorithmic trading and systematic trading strategies, but at that point they weren't really using a lot of machine learning. It was mostly just math. A lot of the finance stuff uses FFTs and these different things, which is cool, but I was like, "Hey, I want to use deep learning." In that time frame, maybe they're ready now, but the bank back then wasn't ready, I think mostly because of regulation. They were like, "Neural networks are black boxes." I don't think they are really that much, but at least in finance it's a little bit hard to explain why something happens, so that was the big hang-up for them. I understood that there was a timing issue there. So I got into it aggressively then. I pushed for the time that I was there, and then when I basically was blocked and couldn't get anywhere, I left and started my company. At that point I met my co-founder, and she was working on an earlier version of this, of NextGenVest. She had been doing it for about two years, trying to get educational material sold and a few other things, and then we started chatting about, "Can we help these people over SMS? How do we scale up financial advice over SMS?" So we kind of pivoted toward that and ended up building the company off of that.

Pete: Yeah, so this is NextGenVest, which was providing financial advice for students, as I recall. So was this partly to scratch your own itch? Did NextGenVest
use deep learning in some way, so that this was your opportunity to start your own company, build a product, and integrate this tech that you wanted to use in a more formal career but didn't really have the opportunity to in finance?

Will: I mean, that was one aspect. But the other aspect was that I had just come from transitioning from the military into Columbia, and I'd seen the opportunity gap. I showed up to Columbia and all these people had had so much training and schooling before they even got there, and I was like, "Okay, well, this sucks. How do we give that access to other people?" So I was thinking about that, and I was also thinking about accessible finance. At Goldman, why isn't there something like banking for Gen Z, for younger people? What products are going to be rolled out for them? So we were thinking about that, and Kelly was thinking about that as well. When we coupled that with machine learning and deep learning, we found a way to scale that up. That's really what we did with NextGenVest: scale financial advice over text message. It was early stage, early NLP. It was pre-BERT, a year or two before BERT, so we were using seq2seq models and, what do you call them, Siamese networks, that kind of stuff, to do a lot of this. I think the models work a lot better today using transformers, of course. But back then we were able to give you real-time predictions on what you should say next to people coming in. At peak I think we had about 65,000 users, and we only had maybe 40 of what we called "money mentors" across the US who were helping scale up the advice to those people, which is a lot, because we would have thousands of chats a day. So we were basically human-in-the-loop, but being able to answer faster
because of the deep learning. And then once we got acquired, I think they've grown massively after that, so I'm curious to see how they've scaled.

Pete: And what about the data challenge that a lot of startups face, where they don't really have enough data to do data science? How did you train your models? Was there a critical mass when you had enough users and enough training data to really make these algos work? How did you solve that problem, or think about it at the time?

Will: Nowadays you can use transfer learning, especially for NLP. Transfer learning back then didn't work; you couldn't do that. So we basically ran the thing without machine learning for two or three months to start getting enough messages, and once we had that, then we could train models. But we had to basically run without models for a while. So you can solve it that way.

Pete: There's just a human in the loop curating the messages and doing it manually until you had enough of a training set to train them.

Will: Yeah. If I were to do it today, I'd take a pre-trained model, like a GPT or something, and then collect enough data. The thing is that today you don't need to collect data for three months. You can probably just do it for a few weeks and then fine-tune on that, and you'll probably get pretty good results.

Pete: Got it. So that was the NextGenVest story. And then you decided to go get a PhD at NYU. Was Facebook mixed in there? Because you ultimately ended up working at FAIR as well, right?

Will: Facebook was not mixed in there. I mean, NYU has a really close relationship with Facebook, because basically most of the professors in that lab have at some point been research scientists at Facebook
at FAIR. But no, that wasn't it. I had applied to PhDs, and when I applied I was still with NextGenVest. We hadn't been acquired yet, so we were thinking about the next stage of the company, and I was like, "Well, we're going to submit the applications and then see where we go from there." Then I got accepted to a few programs, but also NYU, and NYU was in New York, actually two blocks away from the NextGenVest office. So that's one. But second, one of my favorite researchers ever is Kyunghyun Cho, and Cho is one of my advisors. When I see his career, I feel like he's taking that same path that Yann and Yoshua took many years ago. He's already done a lot for the field, but I think he's just getting started. He's very early career, so I was really excited to work with him. I got an offer to work with him, and at the time I hadn't started working with Yann yet, and I was like, "Yeah, this sounds amazing," so I took the offer. Then basically by luck we got acquired during that summer, and I was like, "Oh, great, so we're going to make this happen." We had the decision of, "Do we get acquired, or do we want to raise the next round?" But at the time we thought that partnering with the acquirer made a lot more sense in terms of being able to go into the next stage of the company. So the timing was right, and I started the PhD with Kyunghyun specifically. When I started, I was working on more NLP again, because that's all I'd been doing for a while, and then started working on audio. Then that summer I interned at FAIR, so that's how I got into FAIR. I started doing the internship with Kyunghyun, and after that we started working with Yann as well, mostly because my
research took a turn toward self-supervised learning, and I think Yann's obviously one of the best people in the world at this. I buy a lot of his views on self-supervised learning, and I've since developed some of my own as well. So we started working together at that point, and then I stayed on at FAIR as a PhD researcher, so I continued doing research both at NYU and FAIR. And that's when Lightning was happening, during all of this. Lightning really started when I was an undergrad at Columbia and I was doing research. There was a gap between Goldman and NextGenVest where I started doing research, and after being a professional software engineer I was like, "Okay, how do I move through this stuff quickly?" I was using TensorFlow back then, and I started coding a lot of stuff in TensorFlow. The main problem I was trying to solve was: if I have an idea, how do I iterate, how do I create a new idea quickly without having to copy my code over and over again? If you do deep learning today, you create a file, you have your idea there, and you train, train, train. When you want to train a new idea, you copy that file again, and now you have your other idea and you modify it. Three months later those are out of sync, and now you're maintaining two code bases. It's a mess. So I was trying to solve that problem back then, and wasn't quite getting it. That continued when we started NextGenVest. We went the full production route on TensorFlow, so I didn't really do a lot of Lightning stuff there. It was mostly just production TensorFlow; I wasn't training a lot. Then once I left NextGenVest and got back into
NYU, I was like, "Wait, let me dust off this project and see if I can get this thing to work again." So I rewrote what I had before into what is now Lightning, well, the early versions of it. I still didn't quite get it right. Then basically in the spring before I joined FAIR I got the last abstractions, which are the Trainer, the LightningModule, all that. Then I open sourced it and joined Facebook, and after that is where it got its distributed capabilities.

Pete: So you were essentially trying to solve what I guess is called the glue code problem. Some people prefer to call TensorFlow a library, maybe not even a framework. You're building stuff around TensorFlow, and it's hard to understand the distinction sometimes between the implementation code, the glue code, and the model itself. So you were essentially trying to solve this type of problem through these projects that ultimately pivoted into Lightning, it sounds like.

Will: Yeah, exactly. I started to see the pattern. I was like, "Oh, I know I'm repeating the code, but I'm not sure what." And what I was scared of is, if I start abstracting stuff away, am I going to limit what I can do later? It wasn't obvious what I should abstract, and that's why it took around four years to figure that out. The Lightning API you see today is something where, very comfortably, you're not going to be locked into anything, because it is very flexible, because we did think about this for a long time. As a researcher I need ultimate flexibility. I can't go into a framework, buy into it, spend six months working on it, then want to try something and suddenly it's impossible, and I'm like, "I just wasted six months." So that was what I wanted to avoid with Lightning, and
it's also why I didn't push it too hard in the beginning, because I was still unclear if that could happen. When I open sourced it, I didn't really announce it. You can look at the repo; it was there, but I wasn't talking about it because it was my own project. When I got to Facebook and other people started using it at FAIR, I was like, "Oh, interesting. Okay, let me see: can you do your work on this without hitting limitations?" They did run into limitations for a few months, and we fixed them, we got rid of them, to the point where around August I was like, "Huh, turns out you can do vision, you can do NLP, you can do audio. None of this stuff is blocked. So I guess it's getting more general." We're at the point now where it is pretty general. You can do RL, you can do graph networks, you can do whatever. I don't think people have run into something you can't do with Lightning yet, and if they do, we'll fix it pretty quickly.

Pete: Just so that our listeners understand, and so I understand, can you explain the relationship between Lightning and PyTorch? What do those interfaces look like, and what are the correct abstractions to think of there?

Will: Yeah, for sure. I would consider PyTorch kind of equivalent to NumPy. You have raw tensors and you do operations on those. You also have autograd. So PyTorch is like autograd plus NumPy, but a very high-speed, accelerated version of NumPy. That's what PyTorch gives you: the raw ingredients. For example, why does scikit-learn exist? Because how many times did you take the same SVM and code it in NumPy? It's just boilerplate, so there needed to be an abstraction on top of that. PyTorch gives you those components, but it doesn't really structure your code. It doesn't really give you the higher-level structure. It gives you layers and that kind of stuff, but it
doesn't really let you figure out how to put them together, so you still have to figure that out. It gives you ultimate flexibility, but there's always a trade-off. With ultimate flexibility you also get code that is not easily deployable, you also get code that is not easily shareable, and you also get code that has a high capacity for bugs, because there's a lot of code, a lot of surface area to touch. So it's a trade-off. You get to move faster on ideas, but maybe you can't deploy them quickly, and that's a problem. Lightning basically says, "Okay, let me remove all this boilerplate, which you're going to repeat project to project, and let you focus just on the code that's going to change between projects," which is the computational stuff and how models interact with each other. So it abstracts away the boilerplate. And it wasn't obvious, like I said, what to abstract away. Was the optimizer supposed to be abstracted? We didn't know that. But we figured it out, and we're still learning what can and cannot be abstracted. We'll abstract something and someone's like, "Oh, I can't do this now," and we're like, "Okay, too much, put it back." So you can think about Lightning as literally a style guide for PyTorch, basically. It's like a harness. You're just writing PyTorch, but you're implementing an interface. You have a method, you write your code; you have another method, you write your code; and that's it. But with that structure you get to remove like 90% of your boilerplate, and then you get to scale out across whatever hardware you want, across all the clusters, whatever computational needs you have. It also lets us embed all the engineering best practices. Lightning has over 400 contributors today from FAIR, Google (there are definitely Google people there), Microsoft, I
mean everywhere: top companies, top research labs, people who are putting their code in there because they use it. They could use it for research or production, it doesn't matter, but you're getting the collective knowledge of everyone. So basically you're not going to mess up the simple things. The way you write your for loop in PyTorch might be slower versus a super experienced engineer who knows how to do that much faster. So you get those benefits while maintaining control.

Pete: Got it. And how did the adoption go, walking backwards from where it is now to these early days at FAIR? Basically you brought this open source tool into your job at FAIR, your team probably started to use it, you guys started to improve it. What was the next hurdle in traction and adoption that you saw happen? When did you have an aha moment that the rest of the world would maybe start to use this as well?

Will: I don't think there's a silver bullet for adoption. That's what we're all looking for: what's that answer, "Oh, these seven words will get you adoption." I don't think that exists. What exists is hours of coding, and taking every single feature and every bug and implementing it as fast as possible, as best as you can, testing your code, giving people tutorials and examples. It's just a grind. Grind hard for many, many months. That's just what it is. Either you're willing to put in the effort or you're not. So what drove the adoption is that I have a very high capacity for output, and I was coding non-stop for like a year. And it wasn't just me; it was like 10 other amazing core contributors who dedicated their time to making this and improving it. So you had 10 or 11 people around the clock working on this stuff and
dedicated, not to mention the other 200 people who stopped by, fixed a bug here and there, and made whatever small contributions. So by sheer will, I guess, and force.

Pete: So the community started to discover it and more people started to jump on board. Were you giving conference talks about it? How did it pop up in the wild outside of Facebook?

Will: We would do tutorials and stuff, so people started using it. But I think the key is that when something is useful to people, they will drive that adoption. If it's something that you can use, then you will go ahead and create your own stuff, you'll enter competitions, you'll create your own tutorials. We've all done this. We've all run into such a cool tool that we're like, "Man, I really want to tell people how awesome this thing is," and you write something about it. So it's a little catch-22. I guess I would say focus on the product, focus on building something that people love, and then the growth will come.

Pete: Well, that's definitely the YC model, isn't it? Build something that people love, and that usually wins the day in the long run. But at the same time, there are tactical things that teams end up doing to increase adoption and to establish feedback loops with their users, and there are other tactical things on the distribution side that happen whether they're intentional or not. Sometimes they're not intentional and the community just picks something up. But it's interesting to hear your thoughts on the importance of the product and the usability of the thing itself, which of course I wouldn't disagree with.

Will: Yeah, but you bring up a good point, which is the tight feedback loops. If someone submits a bug and you wait two months to implement a fix, people are going to lose interest.

Pete: Yeah, for sure.
Did the core contributors around the project grow in step with the overall interest in the project? I'm sure you started to see pull requests pop up from orgs that you didn't necessarily know were using Lightning. Did you have to nurture those contributors in any special way?

Will: I think you always have to understand contributors. Part of the job is to allow people to contribute as much as they can. You should facilitate that through process, documentation, all of it: your builds, the speed of the build, everything else. It's also clear when people are really into a project. You'll see their behavior: they're submitting PRs left and right, they're super active, and I think those are awesome candidates to start working with closely. The way Lightning works is we see your behavior, we look at your code, and if we think you're a good engineer, we kind of let you do that for a little bit and see if you continue. If you do, then we'll put you into a tryout period, basically, which will last multiple months. If you make it through that, and if we still like you, then we all decide if we want to take you on board as core or not.

Pete: And this is the recruiting process for the PyTorch project you're talking about, or the Lightning project, sorry?

Will: Yeah. And a few of ours are now at Grid, so we have a bunch of those core people who are now full-time employees, and they're working full-time on Lightning as well.

Pete: Well, this is an interesting transition, and I think you're right, it speaks to the power of having a community around an open source project. Not only does that community become self-sustaining and make the project better, but as you, as the visionary or a visionary behind the project, want to
continue to to push the envelope and do bigger things sometimes those folks can can be recruited into bigger ideas um so so tell us how that transition happened um when did you realize that like at what moment did it become obvious to you that there might be a company to start here around this open source project yeah i think people are asking for a lot of features that it was just impossible to build on the open source and you needed other stuff right more like collaboration and that kind of thing and so yeah that can basically live in a product and for a while i was like how do i build this in such a way that people can benefit from it but at the end of the day like you know this is something that companies use labs you know individuals all these people and um it's it's they won't use it if it's not backed by a formal company right and they won't use it if there's no full-time support and full-time employees so you get into this kind of catch-22 as well because you want people to adopt but they're like well like are you a random project like i don't know right so so then that's the point where you know we fundraised and the money went really towards that right is to hire the people to be able to build the tools for the community so i mean grid is building tools for our community um and then obviously the broader community as well for machine learning as we expand but yeah like everything that we do in grid and all the features that we roll out are driven by our community got it and and so so i get the the notion of having a corporate um entity to sort of back the future development and consistency of the open source project did you actually offer support um like paid support for for lightning or that that you didn't necessarily go that far uh yeah i mean we didn't like we don't really offer lightning support so um support is time consuming and draining on everyone so we uh we give a lot of support in slack so we help people and the community provides a lot of 
support we have considered support for a few key companies that we're interested in working with but generally yeah we haven't really been focused on support got it got it and and you're right and right now like there are thousands of companies using lightning today and you know they find they find us for anything key they'll ping us we'll solve the issue or something but um generally the community is pretty good about self self servicing this stuff as well and how do you measure the size of the community or the interest sort of the traction around lightning like what are the main kpis that you think are useful as a as a sort of an instigator behind the project what do you watch on a monthly basis yeah i mean that's uh yeah this is really the art of open source right we're all trying to figure out how to gauge that because you know we i mean all these things are like github stars but github stars basically tell you about popularity they don't tell you anything else um downloads i think tell you more about adoption than anything else like people downloading things but downloads also hide the fact that you know you download something seven times a day right or continuous integration systems download things like 10 20 times a day so a download is not a user it's you know some ratio um so yeah it's hard we're always trying to estimate that i think when you have a product with analytics it's a lot easier but in open source like we don't have analytics on this stuff yeah i tend to get excited in in in the face of the the weakness of those metrics that you that you just mentioned many of which are common kpis for folks um i tend to get excited about you know the the sort of breadth of the contribution community i think that sort of shows the not just the broad appeal of a project but also of course it's a measure of folks' willingness to contribute back which isn't the same thing as widespread adoption but um to me it seems like there's something there around um a strong breadth 
or growing traction the number of contributors you're seeing across the open source project but that's just that's just one one of my uh thoughts well that i mean that's um that's partly that's that's one part it's just i i do i do agree that that's an interesting metric but the problem with that is that it also depends on the code base of the product and the project right if the thing is super easy to contribute to yeah you're gonna see a huge ton of contributors but if it's complex you're not gonna see that right so i i don't know like i don't think that that's really uh it's one measure but it shouldn't be the only one for sure yeah well interesting to see the open source world continue to move forward and as more companies are started and more founders have these kinds of discussions um you know hopefully we'll all get smarter together about how to quantify some of these things um and of course the vcs want to know the vcs want to know when the time is to jump in and to fund that project like do you fund based on github stars or some other um you know assortment of of random statistics right so these these are these are all interesting questions probably uh yeah adoption obviously but yeah i don't know that's a good question i mean honestly like it's interesting because i don't think i mean like we had what hortonworks and then spark hadoop those people they monetized successfully and they became big companies but we haven't really seen other ones i think from the machine learning world that have done that right so we've seen it in databases of course like and cockroach and all these um but we haven't seen and then there's i guess at databricks they've done the work as well yeah but like we haven't i feel like there's still a ton of room so you know we still haven't all figured it out and um and you know there's like a class like you know i call it like a class class of 2020 you know 2020 or 2019 whatever it is but we all came out and we're like around and 
we're all doing stuff we're in the same space but like there's a big space i think like we're all trying to collaborate and try to figure out how to make that space bigger and grow the pie for everyone and so i hope that we all figure it out because we could set a good precedent for other people who come after us right to figure out how to monetize as well let's talk about that because i think maybe you know one of the concerns of of some folks who watch these markets is a more nuanced question around how quickly is deep learning growing and i know you faced some of these questions um when you were preparing to start grid right um how do you how do you think of the adoption of deep learning as a subset of overall machine learning like how do you sort of um explain that and and how did you know that it was the right time to build something that was specifically in the deep learning category i mean you know i think um i don't know if it's just new york but i think there's always a split between data scientists and machine learning people right like i think when i think about this the difference i think about data science is really focused on you know like user behavior analytics that kind of stuff some prediction models sure but then you have the machine learning and deep learning people who are i think just doing different jobs right um and so if you speak to a data scientist they'll talk about regression and svms and random forests like that's all they need and yeah it's probably true for what they're doing but there's a lot of other stuff that you can use those models for right um that's one but second mathematically speaking those things are differentiable right like linear regression's differentiable so is logistic regression you can approximate random forests um and you can approximate svms as well right with gates so you can you can use autograd i say that because if you can use autograd you can differentiate it's a machine learning it's a deep learning thing right 
you know when we talk about deep learning we're talking about autograd basically so you know like in lightning we have linear regression logistic regression implemented and i can run that thing at scale on imagenet on like 200 gpus right why because it's backed by pytorch and it's backed by a deep learning library so when i hear about people not going into a space or getting into deep learning or not i'm like you're already using it it's just you're using it at a smaller scale whether you call it a regression or logistic regression you are using it you know you strap on three of those logistic regressions that's called a three-layer neural network right so like um so to me it's all the same like if you're looking at it from the math side so i think that those people um you know if they're not using deep learning today like they will because most of the top most of the state-of-the-art models today are deep learning now we all of us in the deep learning world also understand baselining right so we're not going to build the model until we baseline it as well because i want to know how much better is it than a one-layer neural network uh linear regression a random forest but they're baselines at the end of the day right and baselines only get you so far um whereas if you know what you're doing and you tweak your neural networks and you increase the data you get more of the data you're able to generate a lot more so i think that the corporate world is a few years behind they're catching up hard and the ones who are who haven't caught up are feeling that pain because the other startups and the other companies are already there and that they're they're super behind right but they will catch up and um and so i think like the last holdouts there will uh will transition pretty soon so so so you're almost saying that there's a maybe an artificial separation between the notion of my job as a data scientist and the notion of my job or potential job as a 
deep learning um uh or machine learning engineer and and you think that those things like that wall will sort of break down and corporate won't have i mean i guess as you said before maybe there's this slight stigma that you know that that prevents people from moving from the data science to the deep learning camp as they they think that deep learning is is more opaque than maybe it is or maybe than it needs to be um i mean i guess that might be another sort of block that keeps them from sort of feeling like there's a smooth transition from one to the other but um it sounds like you you believe there's a lot of hope there and and ultimately that wall will be broken down yeah i mean i don't know what it is it's like you know i think people i think when i speak to more traditional data scientists you know there's like a point of pride in being like oh i don't need a neural network that's a you know that's a hammer for what you're trying to do and you're like well yeah because maybe you don't know how to use that correctly right like i think if you if you can tweak it and get it right and that's really what grid is about and lightning is hey you don't have to be an expert now you just drop it in there and like it will probably work much better than whatever your baseline is right so that's what we're trying to do right is bridge that gap and say hey i get it like it is look i don't discredit that because it does take a lot to understand how to train a neural network and and like a random forest i don't have to do anything i don't have to tweak any hyperparameters it just works right most of the time um but a neural network you get one hyperparameter wrong and you spend ten thousand dollars in a month and you get nowhere right so i don't blame people for that but you do you should know what you're doing when you start using this stuff if you know what you're doing you get all the value if you don't then yeah probably stick to the more traditional models and and 
simpler things um but there's a lot of good tooling today that can help you get there without you needing to be an expert you know that's what lightning is doing that's what grid is doing as well got it awesome well um let's talk about grid um for a few minutes then um before we wrap things up um explain to us sort of the you know you've sort of um weaved in and out of some of the things that that grid is doing around pytorch but but what's the what's the big vision there of the company what are you trying to accomplish yeah i mean so so the focus of grid is really to help people move really quickly through their through the iteration process right so they want to try a bunch of things they want to see how it works they have data they want to use machine learning you know not not even deep learning just machine learning deep learning um and they just want to try a bunch of ideas as quickly as possible and see what works and what doesn't work right so grid is about that there is focus on that specific use case right so helping you get through those ideas as fast as possible um and we do that through giving you access to like all the compute you need you know scaling up across as many machines as you want um not dealing with infrastructure just so you can focus on the actual thing that you're running so it's about it's what lightning does for deep learning engineers and researchers which is like abstract away all the engineering so you just focus on the math right um same thing here so machine learning engineers data scientists ai researchers you know research scientists anyone who wants to use data and wants to use some sort of machine learning model to get value out of that data that's what grid helps you do is scale that up right so that you can you can move really fast through those ideas so at the end of the day what i love to see is that you know people do their workflows on grid they do everything they need there and then when they're ready to go you know they do 
whatever they want with their models if that model gets published great if it goes into a production system great if it goes into some sort of self-driving car great like doesn't matter right but um i think that we can streamline all these processes so that people get better at like the human creativity part and if i'm not mistaken you're starting with the training process right so you have a cloud-based architecture that helps people train models at scale that's i believe one of your first offerings exactly yeah so that's what you can do today so you can come to grid you know you can spin up all the machines you want you can train whatever you need on there uh you can debug you can prototype stuff you can you know manage that at scale i mean if you're if you're training on your laptop like you don't have a lot of issues but if you're training like production workloads um companies even large academic data sets everything there's a lot of stuff that goes into it like how do you scale up you know 1000 models across each model across 12 gpus or you know 32 gpus and get access to the data quickly like there are so many nuances that go into this and you know it takes teams of people to like really optimize this and and you know we've been doing this for forever like lightning we've been solving these problems for what like almost a year and a half at this point two years most of that team has been you know both sides of the production and research world so that's the the grid team today is helping you bring all that knowledge again to go to a bigger scale that's great you don't need like expert engineers you just need the data scientists and ml engineers who just want to solve problems and like you know use our tooling to to get that done as fast as possible and so the idea is to continue to sort of expand on this this notion this this key insight that you had early on in the in the lightning project around separating um boilerplate code glue code um compute um all 
these other things from the actual core sort of inference um logic and and math and algorithm itself is that is that fair yeah and we can i mean we can scale that idea beyond that right so these are things that will come over the next few years via grid so you know we have a very long road map and yeah i hope you're all part of it to see it come together but yeah i mean the vision is much bigger than that and like you know there we can apply this idea to many things so great where can people find out more information if they want to get involved yeah so we're close to early access right now which means you go to a website grid.ai you sign up and then you know you'll get some email saying hey you're on the wait list and then uh you know we will allow you onto the product as soon as a spot on the waitlist becomes available and then you know you'll be able to do what you need to do on there that's great and if they say they're they're a friend of pete or they heard about it on um on dc thursday then you move them to the top of the list right yeah exactly great awesome um well well this has been a really fun chat um your story is extraordinary um the way you've wrangled the open source community around lightning is really awesome um you know we're all dying to see what happens with grid and and track your success there um i just wanted to ask you like a sort of a parting shot um what's the what's the hardest thing about starting a company as a technical person yeah um i think it's putting your engineering mentality away um when you know like i i still code sometimes but it's really like pulling away from that right because if you're too deep in the code you can't really step out and like holistically get the whole company to align and run right so as a founder and the ceo like your job is to align the company get everyone moving as fast as possible towards a common goal so if you're too too deep in the code it's not going to happen so that's the 
biggest challenge i think for engineers is to separate that and everyone will say this to you and you'll blow them off for like a year you'll say no i can code too [Laughter] yeah there's all the other softer things right the the leadership skills the people management and the recruiting and the mark there's there's all the things that you have to do as a startup founder and um you know we engineers are are pretty self-confident in our ability to kind of multitask and um you know and alt-tab between things but when it comes to a company with real people and um and you have to sort of know how to express that vision and take the time to do it it's a it can be a little bit of a different challenge for sure yeah and i think also know your strengths and weaknesses right like if you're bad at something yeah know that so you can hire for that yeah absolutely well well um thanks again it's been been great to have you on the show and um we'll track the progress on grid from uh from here at data council thank you for having me so just a couple notes for the community um before we sign off make sure to join us next week on march 25th we'll be diving into the analytics data storage space and we'll have venkat who's the co-founder and ceo of rockset which is a very interesting company in the data analytics space so don't miss that um don't forget to subscribe and hit the bell button so that you get notified when we go live um in coming weeks and remember you can find all of our previous episodes of dc thursday on the playlist here on youtube
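The point made earlier in the conversation, that a logistic regression is differentiable and that stacking several of them gives a multi-layer neural network, can be sketched in plain Python. This is only an illustration and not Lightning or PyTorch code: the function names (logistic_unit, layer, network, gradient_step), the random weights, and the toy OR dataset are all assumptions invented for the example, and the gradient is written out by hand where a deep learning library would compute it with autograd.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_unit(weights, bias, inputs):
    """One logistic regression: sigmoid(w . x + b)."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def layer(units, inputs):
    """A layer is several logistic regressions reading the same inputs."""
    return [logistic_unit(w, b, inputs) for w, b in units]


def network(layers, inputs):
    """Stacking layers: each layer's outputs become the next layer's inputs."""
    for units in layers:
        inputs = layer(units, inputs)
    return inputs


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: n_out units, each with n_in random weights and a bias."""
    return [([rng.uniform(-1, 1) for _ in range(n_in)], rng.uniform(-1, 1))
            for _ in range(n_out)]


def log_loss(weights, bias, data):
    """Mean binary cross-entropy of one logistic unit over (inputs, label) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = logistic_unit(weights, bias, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def gradient_step(weights, bias, data, lr):
    """One gradient descent step; the gradient is derived by hand here."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in data:
        err = logistic_unit(weights, bias, x) - y  # d(loss)/d(w . x + b)
        for i, xi in enumerate(x):
            grad_w[i] += err * xi / len(data)
        grad_b += err / len(data)
    return [w - lr * g for w, g in zip(weights, grad_w)], bias - lr * grad_b


rng = random.Random(0)
x = [0.5, -1.2, 3.0]

# A "one-layer network" is exactly one logistic regression.
single = [random_layer(3, 1, rng)]
p_single = network(single, x)[0]

# Three stacked layers of logistic units form a three-layer network.
three_layer = [random_layer(3, 4, rng), random_layer(4, 4, rng), random_layer(4, 1, rng)]
p_deep = network(three_layer, x)[0]

# Because the model is differentiable, gradient descent can fit it (toy OR data).
or_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = [0.0, 0.0], 0.0
loss_before = log_loss(w, b, or_data)
for _ in range(200):
    w, b = gradient_step(w, b, or_data, lr=0.5)
loss_after = log_loss(w, b, or_data)
```

Both p_single and p_deep are probabilities produced by the same building block; the only difference is how many layers of logistic units the input passes through, which is the sense in which the speaker says the two kinds of model are "all the same" from the math side.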
Info
Channel: Data Council
Views: 280
Rating: 5 out of 5
Keywords: data engineering, data pipelines, data catalogs
Id: wuOfBO-Rjaw
Length: 49min 21sec (2961 seconds)
Published: Thu Mar 18 2021