Massimiliano Genta & Avinash Gopal - Metabob, an AI-assisted tool for debugging Python code

Captions
All right, so the next talk is about Metabob, an AI-assisted tool for debugging Python code. Take it from here — the floor is yours.

All right. We want to keep this as informal and Q&A-based as possible; we don't want to make it too commercial. If you like, I can give a very quick five-minute introduction with a few slides, and then we'll keep the rest as an informal discussion. Can you guys see my screen? Yes? All right.

A tiny bit about us. Metabob started as a project about two years ago. Both Avi and I are pretty active open source contributors, and we started working with NEC Labs, a Japanese company's AI lab in Princeton, with the goal of finding a method that could help automate debugging and the code review process. We're located in Silicon Valley, in Mountain View, and right now we have about 10 to 12 members, mainly engineering. We were initially part of NEC's lab, then spun out, and we're now fully independent. We also went through a few accelerators — Alchemist, a San Francisco-based accelerator, is one of them — and we're currently part of the NetApp accelerator, a program targeting deep-tech companies across the globe, and we're doing a project with them.

A tiny bit about myself: as mentioned, I've been an entrepreneur my entire life. I'm originally from Italy — you can probably tell from the accent — and I moved to the United States, to Silicon Valley, about 10 years ago. Since then I've been starting projects, mainly in the AI space. Along with Avi, I also started a project focused on the open source community, called Class, where we created a governance framework to help projects run more efficiently and to help contributors monetize their contributions. Avi, do you want to say a bit about yourself?

Yeah, so my background is mainly in engineering research. Before working with Massi I spent a lot of time in the aerospace sector, doing airfoil design and safety work, and a lot of the numerical methods you use there to make approximations carry over fairly well into the machine learning space, because they use similar processes — since you're trying to identify a pattern based on the data surrounding the pattern, you can also run that in reverse. So it lent itself very well to working in the AI space. From there I also did research into tracking and area-mapping systems for various types of semi-autonomous vehicles, and after that we started working on a project together after joining a hackathon — it was about helping open source developers better understand how their contributions affect the overall end product and how they should, in many cases, be compensated for that.

All right, so I'll tell you a bit about what we're trying to achieve. We all know debugging is an extremely time-consuming and tedious process, and it affects mental health quite a lot. Currently most people use rule-based static analysis tools, linters, and so on, but the main issue with those is that they heavily rely on the individual developer's ability to define problems through simple rules.
And that's really why we created Metabob: we built a semantic understanding model on top of a static analysis tool to enable developers to identify the causes behind what we call logic-based or context-based bugs. To do that, we built the semantic understanding model mainly from high-quality open source repositories, along with some Reddit and Stack Overflow data, companies' design standards, program analysis, and manually reviewed historical results. The combination of all these data sets enables our AI to learn good programming practices and to generate ready-to-use code snippets to automate the process and improve developer productivity. The method currently has around 70 to 74 percent accuracy with just under a five percent false positive rate, and the model improves through user interaction with the platform.

What really makes us unique right now is the type of bugs we're able to detect. Current solutions, which mainly rely on rule-based static analysis, can only detect problems local to a line of code, such as syntax, style, and typos, while on our end, because of how the model was trained, it's able to learn implicitly how these rules function and to detect logic-based bugs: all types of performance issues, misused modules, and race conditions — problems arising from multi-threading or sharing data between processes. I'll let Avi cover that part, but overall the bug types we're strongest at identifying right now are all related to NumPy, pandas, and scikit-learn type errors. We currently focus mainly on ML developers and data scientists, since, due to our data sets, those are the issues our model is strongest at identifying.

Wait a second — sorry about that. All right, we're back. Okay, perfect, you can go ahead. We can go over a quick product demo, and as we show you what the product looks like, feel free to ask us questions. Again, we want to keep it as informal as possible — we're not trying to sell anything, just to get your feedback. So if you already have a question, go ahead, and in the meantime maybe you can show the product and tell a bit more about how we came up with it.

Yeah, sounds good. While I do that, depending on how you want to proceed, I can talk more about the product and how we can help you find issues in your projects, or alternatively we can talk more about AI and ML in general, and programming in general, and I can speak to the challenges we faced while trying to build the product and the types of issues we encountered — whichever you think will be most interesting.

Yeah, it would be interesting to hear how you went about it — how do you even start such a thing? It seems pretty complicated: you basically have to have code that looks for code, right? How would you even start?

Yeah, that's definitely still an ongoing process, and it's a very noisy domain — probably one of the noisiest domains you could encounter, simply because there are many different ways of achieving the same result.
And while under the hood things may end up being fairly similar in how programs are structured, the way patterns are constructed on the developer side can vary quite a lot, and the problems themselves are very context-based — they're not tied to a specific area or a specific type of issue where you can just say, "it's here every time." There are already tools that do linting or static analysis in that sense — you have Pylint or Flake8 and the rest — and they can find big, very obvious patterns that aren't necessarily bugs per se (you can obviously make them find certain types of bugs), but are more areas where it's not best practice to write a certain sort of program in a certain sort of way. That's what the existing solutions are: ways to find certain kinds of problems within your source code. What we would need to do in order to create an AI that goes further is to implicitly learn exactly what code changes need to be made in order to resolve these kinds of issues.

So, speaking to that, as Massi mentioned earlier, what we did was collect wild data from open source repositories — places like GitHub, Bitbucket, and GitLab, along with Stack Overflow, as mentioned earlier. What we were doing with that data was not actually looking directly for the code changes themselves; in fact, what we started out doing was looking at the documentation surrounding the code changes. I don't know how much experience most people here have with Git in general, but there are merge and pull requests that are frequently opened against projects, as well as the commit history, and comments on both of these. For the more well-maintained projects — which is really where we want to do most of our learning, because they're most likely to have either a stable set of standards they want to achieve or a stable set of performance guidelines they want to meet — there's very robust tracking of what changes are made and why they're being made. What we were doing at the front end was going through this data set and trying to learn: why are people making code changes, and what are the underlying reasons? We used a machine learning technique called LDA — latent Dirichlet allocation — to do two-stage topic modeling on this.

For the first stage, we went through all of the data coming straight from the source, did some basic tokenization to remove the more specific identities like usernames and the like, and kept the things that are common across multiple different Python projects. From there, we took that information and tried to determine whether a given comment was referring to a change that was due to a new feature, a documentation change, or a version update — say, numpy went from version 0.1 to 0.2 and you now need to update your package; that's a version bump. We tried to remove all of those from the set by identifying which changes were related to them, without having to look at the code itself, because multiple changes usually land together.
If there's a version change, for example, they might also change the version in a couple of different config files in different locations, and we didn't want to learn from those. So, without having to look at any of the source code, we were trying to identify why a change was made. If we can filter out all of the noise we don't want to listen to — because we're trying to figure out exactly why certain bug fixes were made — we need to remove all the new features, all the documentation changes, and all the version updates, and keep the things that are security improvements, security fixes, performance improvements, and bug fixes. Then, with that remaining data set, we filter it again into a subset of categories based on why the code change was made. We're trying to identify, from the surrounding documentation, whether there is some underlying similarity between different categories of bugs — can we classify them into broad categories? For the most part we were able to do that; it's definitely stronger in some areas than in others. In fact, I think what we're doing right now that is less than ideal is that we're over-classifying bugs into very specific groupings and neglecting some of the broader similarities that could extend across multiple bug categories at the same time but just happen to express themselves as one type of issue in a particular instance due to the contextual factors surrounding the usage of the code. So what we're doing now is accounting for that using a method that allows semi-supervised training, with additional people helping us build partially computer-labeled, partially human-labeled data sets that we can then feed in — so we're now more accurately specifying which categories a bug falls into. And that's just the data cleanup part of it.
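To make that first-stage pass a bit more concrete, here is a minimal sketch of topic modeling over commit messages with scikit-learn's LatentDirichletAllocation. The toy messages, topic count, and filtering idea are illustrative assumptions only, not Metabob's actual pipeline or data.

# Minimal sketch: LDA topic modeling over commit messages to separate
# "why was this change made?" into rough topics (version bumps, docs,
# bug fixes, ...). Toy data and parameters are illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

commit_messages = [
    "bump numpy from 0.1 to 0.2 and update requirements.txt",
    "update README with installation instructions",
    "fix race condition when workers share the results queue",
    "fix IndexError when the input dataframe is empty",
    "upgrade dependency versions in setup.py",
    "add docstrings to the public API",
    "fix off-by-one bug in pagination logic",
]

# Tokenize, dropping common English words; a real pass would also strip
# project-specific identities such as usernames and issue ids.
vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(commit_messages)

# First stage: a small number of coarse topics. Changes landing in the
# "version bump" or "documentation" topics would be filtered out before a
# second, finer-grained pass over the remaining bug-fix messages.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(doc_term)

terms = vectorizer.get_feature_names_out()
for topic_idx, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-5:][::-1]]
    print(f"topic {topic_idx}: {top}")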
So do you have an example — a real example? Say I have this piece of code, I apply your methods, and it tells me how to fix it or what the problem is. Do you have something concrete you can show?

Sure. So what we do is interface with your SCM provider. We currently sit at the CI/CD stage of the development pipeline, near the end of the process: after you commit your code to your provider — after you push it to GitHub — we automatically run an analysis on it and identify certain types of problems according to our detection criteria. You can view them either from within GitHub — we send you a report, if you request one, listing the particular bugs; this happens automatically if you configure it — or, for the most part, you'll be redirected back here into our UI, where we show you a list of the issues we've detected on the left-hand side, along with where they're located in the code base. You can click through any of these and look at a particular type of problem.

In this case, this particular library is essentially a big-data lookup table. They use a special class to generate hashes and store them, and those hashes are used to look up other information based on the hash values — it's a fuzzy matcher, so it can do the lookup in logarithmic time, and that's why it's set up the way it is. For this operation, in a different location in the code base, they're reading in data from external files, or essentially from a public API that they can load things in from, but in this entire flow they never explicitly make sure that the data they're ingesting is of the same type as what was generated. (Would you mind increasing your font size? It's just hard to see. — Sure, is that fine? — Thank you so much.)

So, what we do with our actual bug detection model — which I can explain in a bit more detail now — is take in the raw source and use a technique called graph attention. Our model is a graph-attention-based classifier, which is something very new; essentially, it takes a graph as input — a matrix containing all of the edges between the various components, along with a semantic vector for each component itself. We parse the code base into its abstract syntax tree, which is just how the code is structured — how the Python interpreter views source code. It sees it as a directed graph of different functions and components that call other components and contain other components inside of them: a function will have a list of parameters, each parameter has names, and those names refer to particular things that will need to be loaded from memory under a certain key. So the interpreter already has the ability to parse the code base into this structure, and we use that same basic structure as a base, but we add quite a bit more information on top. We add semantic vectors that come from the context of how something is being used, what it's called, and where in other locations it's called the same thing — we're looking for different words and specific features within each set of abstract syntax tree components to group together into a single semantic vector. We also take into account the code flow, which is essentially how the program is going to run once it's executed: we keep track of things like loops and things that load data dynamically from memory later, so we can store all of this context about how the program will operate and pass it in as a vector to our model.

So what we're actually doing is identifying regions within the code base that have certain structures that have been changed before — we know which regions of code were changed because of the commit history, and we know why they were changed because of the two-stage LDA topic modeling — and we're predicting: these are the regions of your code base that will need to be changed, and here are the reasons why. Then we produce an explanation for each of them. Right now we're trying out GPT-3 for that, but we currently use a modified version of GPT-2 for the publicly available version of our model. So that's basically the whole AI pipeline.
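For illustration, Python's built-in ast module exposes exactly this kind of node-and-edge structure; the sketch below is not Metabob's actual featurizer (the semantic vectors and code-flow edges described above are omitted), just a way to see the raw graph a classifier could consume.

# Sketch: turning a snippet of source code into (node, edge) lists via the
# abstract syntax tree -- the kind of graph a graph-attention classifier
# could take as input. Semantic/feature vectors are left out.
import ast

source = """
def mean(values):
    total = 0
    for v in values:
        total += v
    return total / len(values)
"""

tree = ast.parse(source)

# Index every AST node, then record parent -> child edges.
all_nodes = list(ast.walk(tree))
index = {id(n): i for i, n in enumerate(all_nodes)}
nodes = [type(n).__name__ for n in all_nodes]
edges = [
    (index[id(parent)], index[id(child)])
    for parent in all_nodes
    for child in ast.iter_child_nodes(parent)
]

print(nodes[:6])   # e.g. ['Module', 'FunctionDef', 'arguments', ...]
print(edges[:6])   # parent/child index pairs forming the directed graph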
For each bug detection we want to make, we can look at different places within the code base — it's not just related to one particular structure — because we have a much more well-connected view of how the code base fits together with other components and other modules. What we can do, essentially, is learn which of those modules are most relevant to a particular type of issue, which components within your own code base are most relevant to it, or even whether they're relevant to being that type of issue at all. Using that, we can learn dynamically, from how the edges are constructed, which areas of the code base have a higher propensity to need to be changed, and we can do this with a much better signal-to-noise ratio than methods that don't have an attention-based system to boost them.

I'm going to ask kind of a crazier question: as your system learns more about the code, do you see a future where your code is going to write code?

Yeah. As you can see right here, we already generate little code snippets that can be inserted for certain types of problems, and our end goal for the application is to become a meta-programming tool: we want to be able to take in requirements and produce viable code that fulfills those requirements. There are already other tools and other approaches in this space — even GPT-3: as I'm sure many of you are aware, when it was announced last year there were a lot of demos on Twitter and elsewhere of people using it to generate React components and JavaScript components for web pages. You would ask GPT-3 to, say, make a button, have it be red and centered on the page and say "hello", and it would make something that does that — plus or minus how accurate it was, and it took a couple of tries and wasn't necessarily easy to use, but it was capable of doing it in some instances. The issue with that method, and also with the newer methods that generate code directly from requirements — here's a programming-course question, now have an AI generate the solution — is that all of them are essentially creating entirely new, bespoke snippets of code that do that one thing but don't play nicely with the rest of the code. While it might be fine to generate a one-off thing for a script, or fun to generate a button, they don't take into account that code bases grow to do very large and expensive things with a lot of internal dependencies, and the generated code isn't going to leverage those dependencies because it isn't aware they exist in the same capacity. What a graph attention type of model allows us to do is embed that information within the inputs, and then use it to produce a transformation from one graph structure to another — how to hook up additional edges to generate new code, and also have it use a function that you may have defined in a different module as one of the parameters, or one of the things to call, in order to get some information back, because it knows that when you call that function it will go through its own program flow and return a certain type of information that matches what needs to be generated.
It's better to use that than to generate something new, simply because the code will still need to be maintained by human beings at some point. So what our technique enables us to do is add in that information and then use it to insert other pieces of code from your own code base inside a function that we've generated — and that's what we're currently working on. It is something for the future; right now we also use GPT-3 to generate these code snippets, and they work all right, but it's more of an experimental feature.

Are there any questions in the chat? We had a question from Patrick Schultz: "If I recall correctly, your model learns from some 'exemplary' code repos. Could you give us some examples of what those are, and how did you pick them?"

Yeah. We actually take in all repositories, and we weight them based on how many contributors there are, how many forks there are for a particular project, and how active the project is in terms of having pull requests and using a good structure for integrating those pull requests into either a master branch or a regular release pattern. Those are the kinds of metrics we rate against. Things that typically rank highly are projects from the Python Software Foundation, for obvious reasons — the requests library, NumPy (though most of NumPy is in C, we do keep track of some of the higher-level API code written in Python) — and other well-used libraries, just off the top of my head. That's how we do the ranking, and for things that don't rank as highly, we still train on the data; it's just that, because of how our data set is built, we're naturally predisposed to over-sample from some of the larger repositories, for two reasons. One, they are better repositories with more activity, which also means more code changes, which means more bug fixes, more performance improvements, and more changes relevant to our use case. And two, because we don't want our model trained too specifically for a particular type of project, we also need to include some of the bugs that occur in projects that aren't necessarily the exemplary ones. But we keep a ratio, so that in the resulting training data set the more well-maintained repositories have a greater share and the less well-maintained ones a smaller share, while still being represented enough that we don't fit too closely to a particular code base in order to identify a particular sort of problem — because that's always a risk with this methodology.

That makes sense, thanks for your answer.
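The exact weighting scheme isn't spelled out in the talk, but the idea — score repositories by contributors, forks, and merge activity, then sample training examples with those scores as weights so exemplary projects dominate without excluding the rest — could look roughly like this. Repo names, numbers, and the scoring formula are all invented for illustration.

# Illustrative only: turning the repo signals mentioned above (contributors,
# forks, merged pull requests) into sampling weights so well-maintained
# projects dominate the training mix without excluding smaller ones.
import random

repos = [
    {"name": "big-framework", "contributors": 1200, "forks": 9000, "merged_prs": 8000},
    {"name": "popular-lib",   "contributors": 400,  "forks": 3000, "merged_prs": 2500},
    {"name": "tiny-util",     "contributors": 3,    "forks": 12,   "merged_prs": 40},
]

def sampling_weight(repo: dict) -> float:
    # A toy score favouring activity and maintenance signals; the real
    # weighting scheme is not public.
    return repo["contributors"] + 0.1 * repo["forks"] + 0.05 * repo["merged_prs"]

weights = [sampling_weight(r) for r in repos]
picks = random.choices(repos, weights=weights, k=10)
print([r["name"] for r in picks])   # mostly the larger repos, but not exclusively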
We have a question from Chris Morrow here in the chat, and he'd like to know if there are particular categories of bugs that Metabob is especially good at finding.

Yeah. As Massi showed for a very short segment — and as I'll show again for a much longer period of time — we're much better at detecting these sorts of problems: in particular, what we can detect more cleanly are issues related to NumPy and other data science libraries like scikit-learn, simply because there are a lot of them in the data sets. A lot of people doing Python are doing it in the data science and machine learning space, and they use things like NumPy, scikit-learn, and pandas, and on top of that there are frequently a lot of issues that arise from not doing things the quote-unquote proper way. There are a lot of performance improvements you can make in NumPy that take advantage of how it's built internally, which aren't readily apparent to somebody who may just be getting into Python and is used to Python lists — or isn't — and assumes NumPy works the same way, even though you handle the two very differently to maximize performance from each. That's mainly what we can find, and then, in decreasing layers of confidence, we also cover other issues related to race conditions, multi-threading and multi-processing, sharing data between different processes launched from the same program, as well as issues arising from web service frameworks. These are, I guess you could say, endemic problems in the programming space, especially for newer developers, so it matches fairly well with what we can detect right now. But really what we want to do is extend it to support more general programming issues as well, and that's what we're working on right now.

Thank you so much for that answer. If you don't mind, I'd like to follow up on the question about NumPy errors specifically. I know there's a big push right now to get support into Mypy — the optional static type checker for Python — to be able to annotate NumPy arrays with at least their dimensionality, and possibly also their shape. Is Metabob something that, out of the box right now, can detect shape mismatches between array operations?

Yeah, actually, for some shape operations or shape mismatches we can identify issues that arise from not ensuring the shapes are properly correlated, simply because usually what you do to fix this is add additional pieces of code to either coerce the data into the proper form or make adjustments to the underlying case. Because we have samples like that in our data sets, we can identify when you're doing these operations in a way that is unsafe — or at least in a way that isn't guaranteed to work all the time, because you aren't ensuring that the shapes are properly coordinated — and we already flag some of those as potential issues, with the issue explanation being related to the shape.

Awesome, thank you so much.
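A small, hedged example of the kind of shape mismatch being discussed — the fix is exactly the sort of added coercion or check mentioned in the answer. The function, field names, and shapes are invented for illustration.

# Sketch of a shape-mismatch hazard: data arriving from an external file or
# API is combined with an array of an assumed shape, with no check that the
# shapes actually line up.
import numpy as np

weights = np.array([0.1, 0.2, 0.3, 0.4])        # one weight per feature column

def score(features: np.ndarray) -> np.ndarray:
    """Weighted sum per row; assumes `features` has shape (n_samples, 4)."""
    # Defensive check of the kind the fix usually adds: without it, a
    # transposed or truncated array either raises deep inside NumPy or
    # broadcasts into a silently wrong result.
    if features.ndim != 2 or features.shape[1] != weights.shape[0]:
        raise ValueError(f"expected shape (n, 4), got {features.shape}")
    return features @ weights

print(score(np.ones((5, 4))))        # fine: five rows, four features
# score(np.ones((4, 5)))             # would raise a clear ValueError here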
As always, if you'd like to ask a question, please feel free to unmute yourself and speak up, and if you're feeling a little shy and would rather have me ask it, of course you can keep typing them into the chat. I'd like to follow up with one too. In the work that I do, a lot of the bugs we encounter have to do with data validation problems: we get data — particular values — that we aren't expecting, and after six hours of work we find out there's a divide-by-zero happening somewhere because we weren't expecting any zeros to show up. Is this something Metabob can currently conceive of?

Yeah. They're mostly clustered around issues with web frameworks, but we do identify problems like that, where you're reading in different users from an API, for example, but not necessarily going through the process of validating that it is exactly what you should be expecting later, and then accessing those items without properly going through that process. We also see a lot of people move towards a more model-based or data-structure-based approach to this, using a dataclass or something similar to coerce things into; after marshaling the data into one of those objects you then pass it around to the rest of your components, just to ensure that everything lines up in the proper format. Because there are examples of that in the open source data set, we can find problems like that — ones that would require that kind of change.

Awesome, thank you so much.
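The "marshal into a dataclass and validate once" pattern mentioned in that answer might look something like this; the class, field names, and the zero-check are illustrative assumptions tied to the divide-by-zero example in the question.

# Illustration of the validate-on-ingest pattern: external records are
# coerced into a small dataclass once, so a zero denominator is caught at the
# boundary instead of surfacing hours later as a ZeroDivisionError.
from dataclasses import dataclass

@dataclass
class Measurement:
    value: float
    scale: float          # later used as a denominator

    def __post_init__(self):
        self.value = float(self.value)
        self.scale = float(self.scale)
        if self.scale == 0:
            raise ValueError("scale must be non-zero")

def normalize(raw: dict) -> float:
    m = Measurement(**raw)              # bad records fail loudly right here
    return m.value / m.scale

print(normalize({"value": "8.4", "scale": "2"}))   # 4.2
# normalize({"value": 1, "scale": 0})              # raises ValueError at ingest time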
I'd love to know a little more about the feature of writing our code, because one of the big challenges of true multi-processing — not just multi-threading — is things like data races, and you can even get into hardware-level issues if your machine's big enough, if you're getting into supercomputing, stuff like that. It would be really nice if this could even lead to some kind of easily scalable multi-processing library, because if this thing finds that there are hot spots of errors, that could guide the development of even hand-coded, human-developed software to address those problems — a sort of patch of patches, if you will.

Yeah, I definitely get what you're getting at. Even if we just provide a heat map of, say, areas inside the multiprocessing library — when it's being used, these are the recurring points that are going to cause the most issues down the line, simply because they frequently need to be changed because they keep causing problems, especially at scale — then a human being working on that library can take additional care to make sure those are safer, or at least more performant, for those use cases.

Oh no, I have something else in mind: to allow human beings to take less, or even no, care about the details of the implementation — just like how we don't worry about what the assembler or compiler is doing when we write a C program.

Yeah, so again, as mentioned, that's basically our angle. What we want to do is allow developers to focus more on the broader architectural goals of whatever they're developing, as opposed to dealing with — I guess you could say it's a step above boilerplate, but the rote code you would write in order to do certain sorts of operations. For example, getting jobs to run on a multi-processing system: you would build the same sort of structure, the same sort of architecture, and then put things into it that are application-specific. What we want to be able to do is say, hey, just make this multi-processing structure, I need the number of workers to be a function of how many processors are available on the host machine, and also run this function a couple of times — and it would just be able to do that directly, fill in all the gaps, and use the appropriate methodology to spawn the processes, manage them, and handle the data passing through all of that. So yeah, I think we're in agreement about what we want to be able to do; obviously that's kind of far off from where we are, so right now we're going to focus on trying to fix the small problems and then work our way up.
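As a minimal sketch of the rote scaffolding being described — sizing the worker pool from the host machine and running a function a few times across processes — the standard library already covers the mechanics; the work function here is just a placeholder.

# Sketch: worker count derived from the host's CPUs, pool handles spawning,
# managing, and passing data back. The job itself is a stand-in.
import os
from multiprocessing import Pool

def work(job_id: int) -> str:
    # Placeholder for whatever application-specific job needs running.
    return f"job {job_id} done in process {os.getpid()}"

if __name__ == "__main__":
    workers = max(1, (os.cpu_count() or 1) - 1)   # workers as a function of host CPUs
    with Pool(processes=workers) as pool:
        for result in pool.map(work, range(4)):
            print(result)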
I don't think you're really that far off from it, because there is such a library already — it's called Dask — and it has a really nice object-oriented lazy evaluation where you do just that. It even has a diagram printer; it's great. I think the real question is: could you look at a piece of Dask code and say, "have I done something that isn't obvious that is going to lead to some sort of kerfuffle?" And the other question is, when you're looking at more than just a patch of code — when you're looking at systems of .py files, larger assemblies — things can go wrong, because maybe there's a blocking issue somewhere, something like that. That's something I'm a little more interested in, because I think with a bit of practice you can learn how to write good code, and you can use tools like that to make sure it does just work, but that's sort of only the beginning of the problem.

Yeah, it's definitely a very exciting space to be in — that's kind of why we wanted to take on this project. We feel that right now there's a lot of new research in this space, on both the AI side and the more analytical side, to determine the best way to, I guess you could say, discretize the problem and make it very explicit how things are going to be structured and where problems are occurring, and with all this data and all these new techniques there's definitely a lot of room to grow and really make something that is phenomenally useful.

So what would be your improvement on Dask?

Our improvement on that particular operation would be — I'm not too familiar with it, so I can't give you any specific examples of what our improvement would be. From what I understand from your description, it's a way to do lazy operations on a rough graph structure of what the code base would look like. What we would be doing is not specifically the analytics part of it, but more building out each component itself.

Yeah, that's right — it builds that for you. You create an object that's lazily evaluated, so you say, okay, I want you to do this and that and the other thing, you tell it how many workers you want, and you do it all in native Python code, and once the whole thing is built you just hit run and it works.

Oh, okay. So basically our improvement on Dask is not to be a clone of Dask: we want to be able to do this for anything, not just particular types of Python data science libraries. Our end goal is not to only have the ability to automatically handle NumPy arrays or anything that specific — we want to be able to generally create code for any arbitrary operation. Obviously that's a very much larger goal, so right now we just want to fix particular issues — not necessarily scaling in NumPy or scaling on clusters, but also finding other sorts of issues in data validation, or data cleanup as part of an ETL pipeline, or different sorts of data manipulations; we can find and fix those issues as well. So if Dask works very well for scaling up on that side, then maybe what we'd be doing in terms of "improving on Dask" is getting more projects to adopt Dask as a requirement, by giving suggestions that say: don't use a NumPy array right here directly — use a Dask array instead, because it's just going to be more scalable for you — and that's the explanation and suggestion we provide, rather than rewriting it ourselves. We're going to take whatever is the most effective solution to the problem, or at least the goal is to always pick the most effective solution for the problem.

I gotcha.
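A hedged example of the kind of swap that suggestion points at — the same computation expressed with a plain NumPy array and with a Dask array that evaluates lazily in chunks. The array sizes, chunking, and the computation itself are arbitrary here.

# Same computation, eager with NumPy versus lazy and chunked with Dask.
import numpy as np
import dask.array as da

x_np = np.random.random((4_000, 4_000))
eager_result = (x_np - x_np.mean(axis=0)).std()          # computed immediately, in memory

x_da = da.from_array(x_np, chunks=(1_000, 1_000))        # lazy, chunked view of the data
lazy_result = (x_da - x_da.mean(axis=0)).std()           # only builds a task graph
print(float(eager_result), float(lazy_result.compute())) # compute() runs the graph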
All I was going to say to that was that I think you should definitely be looking for really thorny problems that have to be done a lot and are really grueling. Might I also suggest JavaScript?

We're actually looking into JavaScript right now on the, let's say, secret development side. We're working on a Java version for some of the other customers we're speaking to, simply because they have quite a lot of Java code, and in addition to that, the second-most-requested language support is JavaScript. So we're also going to be working on JavaScript later this year and further down the line; it's definitely something we're interested in supporting as well.

Yeah, you can fix ye olde Java, you know, that freaking thorn bush, and plumb the depths of madness that is JavaScript. I highly recommend watching "wat", all lowercase. I think you'll have made a lot of people a lot less dependent on various substances, some of which are not even legal.

Yeah, hopefully we can improve public health — maybe that too.

I do have a question. The way it is right now, it practically sits at the end of some CI/CD pipeline, right? Have you thought about having some kind of IDE plug-in or something more immediate, where it's practically part of your programming flow — you change something and the plug-in says, hey, you've done something wrong — or, if that's not possible, maybe when you do your commit locally it says, hey, by the way... Something more immediate.

So, IDE integration — something more like Kite, or like a linter you would normally use, like Pylint: as soon as you type something, or as soon as you finish defining a function, it goes through and says, hey, whoa, you need to fix this — before you've already committed and embarrassed yourself.

Yeah, exactly.

Well, we don't solve embarrassment right now, but it is something we want to help developers avoid in the future by putting this earlier in the development pipeline, closer to the IDE. We have some issues with that, mainly because of how we're doing the inferencing. The way our model works — and what makes it unique — is that it sees a broad view of your code base and how it's connected and uses that to produce the inferences, but if you're doing local programming you would need to upload most or all of that, we'd need to keep track of all those states, and it's more of a security issue that we want to avoid. Kite actually ran into a big issue with this a couple of years back, about uploading people's code to their servers to do machine learning on it.

Maybe you can encrypt your models and make them part of the IDE, if they're not too big — I don't know, maybe that's another way. Or something more like federated learning, I think it's called, from what I've heard, where instead of bringing the data to the model...

So one of the other ways we could do this is to have the model run locally; it's just too big right now. Our goal is to refine the main Metabob model to a much higher degree than it is now, and when we plateau or get to a very good spot there, we'll probably see if we can make a "Metabob junior" that can run locally, and have that be something running in your IDE that keeps you up to date and finds the bigger issues; afterwards you can still use the more robust version of Metabob later in the CI/CD pipeline to catch the more intricate issues.

Yeah, you can use some kind of compression method — I think it's called the teacher-student model — where you prune some of the small connections but still get almost the same accuracy, maybe lower by a few points. You use a teacher, and then you have a student that's lighter; the student learns practically what the teacher knows but ends up being a smaller model, and you might be able to bring it locally.

Yeah, that's kind of the more ideal way of doing it, rather than uploading anything — again, for security reasons, which we take very seriously; we try not to store any user data for any longer than is absolutely required.

Right — if you bring the model locally, and say you encrypt it — there are ways to encrypt models — you avoid that problem, because you don't send any data downstream. And if it's part of the IDE and it's not that big — that's the other problem — then you can avoid the whole issue of sending data down the pipe, where you have the security issues.

Yeah, there are definitely methods to do it, and it's pretty exciting, actually, in terms of what could be possible.
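For reference, the teacher-student idea mentioned here is usually called knowledge distillation, and it's commonly expressed as a loss mixing the hard labels with the teacher's softened predictions. This is a generic PyTorch-style sketch of that loss — none of it is specific to Metabob's model, and the temperature and mixing weight are arbitrary.

# Generic knowledge-distillation loss: the small "student" matches both the
# true labels and the large "teacher"'s softened output distribution.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    # Hard-label term: ordinary cross-entropy against the ground truth.
    hard = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between softened teacher and student outputs.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * hard + (1 - alpha) * soft

# Toy usage with random logits for a batch of 8 examples and 5 classes.
student = torch.randn(8, 5, requires_grad=True)
teacher = torch.randn(8, 5)
labels = torch.randint(0, 5, (8,))
print(distillation_loss(student, teacher, labels))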
Any more questions in the chat? — I have kind of a weird question, and I was waiting for a segue to ask it. You did mention Kite, and Kite is an AI code-generation tool. Have you tried running Metabob against a code base that was heavily generated by Kite?

That is a great suggestion. I'm going to try it, probably over this weekend, but we have not done that yet. I'll see what the results are — that actually sounds very interesting.

Yeah, I was the weird kid that would pit chess AIs against one another.

No, that's definitely a good idea. I'll run it as soon as I'm able, and if there isn't a very good Kite-only repository out there, I might just make one.

Do you happen to have any buggy code to show how it runs at all?

Not in a live demo, but you can install it from metabob.com — it is free, so feel free to try it out.

So any of my code can get tested?

Yeah, any of your Python code, currently, because again, we only support Python right now. We're up to date, so everything should be fine, and when 3.10 comes out we'll also be supporting that.

That's the other question: you said Python, and you're working on JavaScript in the background. Are you going as requests come in — into C++ or other things like that — or are you sticking with this?

Right now we don't want to branch into too many languages, because we do want to focus on improving our overall detection capability. For the other languages, our strategy is: we're doing Java right now, mainly because we previously agreed to do Java, but we're also trying to take the learnings about how our model generalizes across different languages, and which techniques carry over, to see if there are additional detections we can support, or whether the languages are too distinct in how they're architected or what their common patterns are for much to overlap. Depending on how that lines up, we'll optimize our workflow either for additional languages or for how we move forward otherwise. So we intend to support Java, and then potentially JavaScript down the line. For other languages like C++, we may or may not support them — C++ is definitely more likely than something like Go or Rust, which already have very robust, very standardized ways of handling and operating with them, and because of that we don't offer that much of an advantage there, at least until we can get a much higher quality level of detections for the very intricate, repository-spanning types of issues — which hopefully we'll have in a year or two, so that would be a good time to start going there, but we don't have concrete plans for it.

I have a suggestion. What if this code could somehow figure out how a certain program works and then transpile it into another language? One of the big complaints about technical debt is that, legitimately, there is COBOL out there that no one can ever touch, because the people who wrote it are retired or even dead, and it follows none of the proper programming best practices, so the side effects of even the slightest change could be catastrophic in production and there's no way to test everything. There's old Fortran 77 that still lives out there, and there are plenty of people using C++ who would love to switch over to Rust, but everything is in C++. This is sort of pie-in-the-sky spitballing, but it seems like you could switch over that way and rewrite code bases to make them easier to work with.

Yeah, it would be really cool. There are some issues with using AI to do this, mainly because you can't do a direct translation, for obvious reasons.
Internally, everything starts out differently, at least in the steps between the language itself and the compiler — though after you get to machine code it's basically the same. The issue is that if you're going to use an AI to generate code, it's going to have some slip, and that's unfortunately more or less unavoidable. But it would be something that provides quite a bit of value as we move to more robust development methodologies — moving legacy code over would also help maintainability overall. It's definitely interesting, but probably not the best use case for this, at least from my understanding of the problem.

Oh, what a shame, because, man, there would be so many things you could do — you could make Rust self-hosting.

Yeah, you could do a whole lot of other cool things. Maybe in the future — maybe I'm underestimating the capabilities, or maybe I'm just being a bit too cautious.

That might always be the case, I know, but it's definitely something that I think will be a very strong use case in the future.

All right, well, if there are no other questions, it was wonderful having the time to come on here and gush about AI and our platform in general. We're more than happy to come by any other time — maybe after I run that Kite-versus-Metabob head-to-head competition, we could stop by again.

I have a question before you go: why did you pick the name Metabob?

That's a great question. We sort of picked the logo first — we drew him, and that guy kind of looks like a Bob — and we're a meta-programming tool, at least in the future we will be, so: Metabob.

I'm glad you asked that question. As I mentioned in one of the emails I sent you, I really enjoy a sci-fi series of books called the Bobiverse: a developer dies and uploads his brain to a computer, wakes up somewhere in the future, and replicates all over the universe. So every time I hear Metabob, I actually hear the Bobiverse — you might have gotten that question before.

That's great — we didn't even have that in mind, but as long as it works out, it's great.

Well, it's catchy, and now people will remember the name just because of the story.

All right, so yeah, it was wonderful being here. I think Massi wanted to say a couple of words — was there anything you wanted to add?

Yeah, sorry guys, I misunderstood the timing for the meetup a bit — but he's a way better developer than myself anyway, so it was better for him to talk than me. Thank you so much for having us; we really appreciate all the feedback and all the suggestions.

Thank you — this was great, actually; I totally enjoyed it. Very interesting work there.

Yeah, thanks so much, appreciate it. Thank you guys. Stay safe and have a great rest of the day, and if you want to connect on LinkedIn and show us your projects or anything...

Maybe you can post the link to your website in the chat — we're going to stay on a little bit longer.

Yeah, I'll do that. That's my LinkedIn, and that's my email, in case anyone wants to connect — for whatever projects you're working on, we always love to hear what other people are working on as well.
And if you want to check out the website, it's pretty straightforward — it's just metabob.com — but I'll also put it in the chat.

Wait, don't go yet, guys — I'm trying to connect with you on LinkedIn. That's your LinkedIn URL, right?

Yeah, that's my LinkedIn URL.

You know you can change it — it doesn't have to be so horrible.

I know, I know, I kind of like it like that, though.

That's his middle name — are you making fun of his name?

I guess so, yeah — though it was more about the hexadecimal at the end of it.

So, we recorded this, so I'm going to have to ask for your permission to post it on our meetup website or YouTube channel, if that's okay with you.

Of course, yes, no problem.

Awesome, thank you so much. I'm going to stop the recording right now, just so you know.
Info
Channel: Austin Python Meetup
Views: 32
Rating: 5 out of 5
Keywords: python, metabob, Austin python meetup
Id: -r8gn3LvWSY
Length: 61min 9sec (3669 seconds)
Published: Thu Jun 17 2021