Chicory: Creating a Language-Native Wasm Runtime by Benjamin Eckel / Andrea Peruffo @ Wasm I/O 2024

Captions
Okay, welcome to Chicory: creating a language-native Wasm runtime. A little bit about us to start. My name is Ben, I'm the co-founder and CTO of Dylibso, and here is Andrea, a principal engineer at Red Hat. I live in New Orleans, Louisiana; he lives in Lisbon, Portugal. For the past six months or so we've been collaborating online, over the Atlantic Ocean, on this in our free time, and I finally got to meet Andrea for the first time: first time meeting him, first time doing this talk together. He also snuck some slides in here to surprise me, which I don't know about, so it's going to be fun. To keep you awake. Yeah, keep everyone awake. So let's get into it.

First off: why are we talking about the JVM at a Wasm conference, you might be asking yourself? I would basically counter: why are we not talking about it? It's actually kind of weird how little we talk about the JVM. It has been mentioned in a few of the talks, certainly you hear "write once, run anywhere" and a few jokes, and there are certainly a lot of issues with the JVM. But the JVM was created with the same goals as Wasm in mind, really, if you go look at and understand its history, and it has a 30-year head start, so its people probably know some things that we need to know. Previous talks have mentioned this idea of learning from others, and that especially applies to the JVM and the people working on it. So if you're skeptical about the JVM, or you have certain ideas about it, I think that if you're working on Wasm at all you should take the time to understand it, study it, and use it. Even just using it and experiencing what a mature virtual machine like that feels like is enough, but studying it is good too, and it's not that complicated.

Second, and this is what matters most to this project and why we're talking about the JVM here, is the mountain of value that has been created by the JVM. I want you to stop for a second and reflect on how deeply the JVM has embedded itself into the modern economy. Just meditate on that. There are 30 years of JVM software in the world, creating, conservatively, hundreds of billions of dollars of value across nearly all business verticals; you might even say trillions. I don't know, but that's reasonable. Think about all the massive public companies that are exclusively powered by the JVM. Think about ecosystems like Android: billions of devices all across the world, all powered by the JVM. It's huge. And no, this is not "Wasm Enterprise Edition"; this slide must be an Andrea plant. We're not announcing that here, unless you want to give us a big contract, in which case we will sign it.

So, we have 30 years of software, great. Why are we talking about the past? Why do we care about those 30 years of software? This talk is largely about embedding Wasm into the JVM, but should we be focused on compiling all this JVM code to Wasm instead? I don't think so. We believe the most compelling reason to use Wasm is that it can unlock the full value of your applications by making them programmable platforms.
We think that in the future, having a Wasm interface will become the norm for integrations, because Wasm will help us scale the complexity of integrations by facilitating multiple orders of magnitude more interactions than an HTTP API can. The basic idea is that the more interactions you can have between two integrators, the more features you can extend and customize, and the more things you can do; it unlocks more value and more customization. This is why we think embedding Wasm into the JVM is an important path forward: we can't just leave behind all the stuff we've done, and there's still lots of JVM software to write in the future as well.

So how do we execute Wasm in the JVM today? There are a number of really good, mature Wasm runtimes, like Wasmtime and Wasmer, really great runtimes. The only issue is that they are written in languages like C, C++, and Rust, and they're distributed as native code, which has some pros and cons. The main downsides are really two things. The first is distribution: if you're linking to a native object from the JVM, you have to ship that thing with your application or your library. One of the main reasons you use the JVM is that you compile to platform-independent bytecode, that's kind of the point, and now you have this whole cross-product problem of compiling for every OS, architecture, and libc that you need to ship this thing for. The other side is the runtime: to communicate with a shared object, on most systems you need some kind of foreign function interface (in Java there are a few different names for that, but it's roughly the same), and there's a lot of complexity and a lot of problems when your execution leaves the safety and observability of the JVM.

Let's talk about some specifics of what that means. If you stay within the JVM boundaries, you get guaranteed memory safety; they've been doing it for a long time, and it's a pretty good guarantee. You get fault isolation: if your Wasm program is JVM bytecode, then it can't crash the JVM, which otherwise is a major problem for a lot of applications. And you get a super-advanced JIT: you can imagine that when you use FFI, the JVM's JIT just sees the program as a bunch of holes, with execution going in and out, but what if it were all one stream of JVM bytecode? There are lots of really good reasons why you want to stay within the boundaries of the JVM.
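To make that concrete, here is a minimal sketch of what "never leaving the JVM" looks like in practice, loosely based on Chicory's public examples around the time of this talk; the package names and builder API have changed across releases, and the module path and export name are illustrative, so treat all of it as approximate rather than the project's definitive API. The point is that the whole call is ordinary Java objects and bytecode, visible to the debugger, the JIT, and the GC, with no JNI/FFI boundary to cross.

```java
// A hedged sketch, not an exact copy of Chicory's current API: load a Wasm module,
// grab an exported function, and call it -- all as plain Java.
import com.dylibso.chicory.runtime.ExportFunction;
import com.dylibso.chicory.runtime.Instance;
import com.dylibso.chicory.runtime.Module;
import com.dylibso.chicory.wasm.types.Value;

public class PureJvmWasm {
    public static void main(String[] args) {
        // "factorial.wasm" and the export name "factorial" are illustrative.
        Module module = Module.builder("factorial.wasm").build();
        Instance instance = module.instantiate();
        ExportFunction factorial = instance.export("factorial");
        long result = factorial.apply(Value.i64(5))[0].asLong();
        System.out.println("factorial(5) = " + result); // computed without leaving the JVM
    }
}
```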
So what do we do about that? This is another Andrea slide, and I assume what it's saying is that we put the VM in the VM. Yes, that was the intention. So: we put a VM in a VM.

The non-joke slide is an idea we're trying to coin, called a language-native runtime. We don't really expect this type of runtime to replace the state-of-the-art runtimes, but rather to fill in the gaps, maximize portability, and provide a better developer experience. A good analogy we've thought of is the difference between, say, an F1 car and an off-road vehicle, a Jeep or something. With the F1 car you get the best performance and the best technology; the cons are that it's not going to be able to run in many places, and if it breaks even a little bit you're in a lot of trouble. The Jeep, on the other hand, is very portable, simple technology, obviously not the fastest, but it will get you into a lot of different terrain. And I can even see, as something we do in Extism, switching out the runtime where you need it: in a more hostile environment I might use Chicory, and in a known, performance-sensitive environment I might drop down to a more performant runtime.

We're not the first people to do this. We consider ourselves spiritual cousins to wazero, because they were probably one of the first runtimes to pioneer this idea, and credit to Eduardo for this incredible piece of art, which honestly should be in the Louvre. Although Go and Java are very different (Go compiles to native binaries, Java is all managed runtime), they have very similar problems when it comes to reaching out to native code, so these two projects have a similar spiritual connection. I also think this is probably not a problem unique to these languages; I could enumerate a few others. You can imagine that whenever you have a language with a strong, managed runtime, you might hit some of these problems once execution or memory allocation escapes it.

So we're going to introduce Chicory. Andrea is going to come up and talk about everything we've done and some of the technical details. Thanks, man.

It's a pleasure, and today we are introducing our project, which is Chicory. You can find it on GitHub under Ben's organization, Dylibso, so please come check it out, submit issues, and join our Zulip channel; we'd love to welcome you into the project.

So, what happened? It has been a really quick journey, because I realized it has only been six months that I have been collaborating with Ben, and this was a side project for both of us. What we have been able to achieve in six months, and the fact that we are here today, is really exciting for me, and it has been an extremely great journey. Eduardo from wazero introduced us; he is a former colleague who went to work on wazero. With Ben we envisioned running Wasm workloads on top of the JVM by writing a Java-native runtime, because we really wanted to see those kinds of applications enabled on the JVM, which was something that was completely lacking, or at least only doable with constraints and problems. To make it possible in six months, we concentrated all of our energy on delivering demos and POCs on the happy path, that is, on implementing the Wasm instructions so that we could actually run programs.

Building a language-native runtime: here are a couple of notes, because we are not the first (we are following wazero, of course), and there are a bunch of things you can do to make it easier to write a language-native runtime in your preferred language, even outside of Go or the JVM. The first thing we did was to generate code for the opcodes, keeping them in a TSV file, which is very portable and very readable across different languages.
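As an illustration of that approach, here is a toy generator; the column layout and file names below are hypothetical, not Chicory's actual table, and the real generator is richer. The point is that the instruction table lives in one language-neutral text file and each runtime generates its own source from it.

```java
// Toy sketch of generating an opcode enum from a TSV table such as:
//   0x6A<TAB>i32.add
//   0x6B<TAB>i32.sub
// This only shows the shape of the idea.
import java.nio.file.Files;
import java.nio.file.Path;

public class OpcodeEnumGenerator {
    public static void main(String[] args) throws Exception {
        StringBuilder out = new StringBuilder("public enum OpCode {\n");
        for (String line : Files.readAllLines(Path.of("instructions.tsv"))) {
            if (line.isBlank() || line.startsWith("#")) continue;
            String[] cols = line.split("\t");
            String constant = cols[1].toUpperCase().replace('.', '_'); // i32.add -> I32_ADD
            out.append("    ").append(constant).append("(").append(cols[0]).append("),\n");
        }
        out.append("    ;\n\n    OpCode(int opcode) {}\n}\n");
        Files.writeString(Path.of("OpCode.java"), out.toString());
    }
}
```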
We keep a Wasm corpus, programs written in different languages and compiled to Wasm, in a folder, so that we know we won't have regressions over those programs, because we run them on top of Chicory during the tests. And we make heavy use of native build tooling, like Maven plugins, for automating all of it. What we do most with those Maven plugins is run code generators: we have only about 13,000 lines of code committed to GitHub, but at the end of the day, when you run the tests on your machine, there are over 250,000 lines of code that are just tests generated from the test suite. This is a really powerful thing that we discovered, and it sped up the development of the language-native runtime.

These are the steps you would need to follow if you want to go down this path and integrate the test suite while writing a runtime. First of all, there is a TCK for WebAssembly, the WebAssembly test suite, which you have to download. The test suite contains a number of .wast files, which are basically an extended representation of the Wasm text format, and since we don't have a text parser just yet, only a Wasm binary parser, we have to download an external tool, wast2json, to transform those .wast files into something we can read and generate code from. So we download wast2json as well and run it on the .wast files; this produces a folder containing the Wasm modules in binary format plus a JSON file that represents the assertions and operations to run against the compiled files. Parsing that JSON and generating assertions out of it is, for us, the way to generate JUnit code, so that we can use a debugger to step through each and every instruction and each and every assertion coming from the test suite.
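For a sense of what that pipeline produces, here is a hand-written approximation of a single generated test case. The real generated sources, class names, paths, and Chicory API details differ, so treat them as assumptions; the wast2json invocation in the comment is the standard wabt usage.

```java
// wast2json i32.wast -o i32.json   (wabt tool; produces i32.0.wasm, i32.1.wasm, ... plus i32.json)
// Each "assert_return" command in i32.json becomes one JUnit assertion like the one below.
import static org.junit.jupiter.api.Assertions.assertEquals;

import com.dylibso.chicory.runtime.Instance;
import com.dylibso.chicory.runtime.Module;
import com.dylibso.chicory.wasm.types.Value;
import org.junit.jupiter.api.Test;

class I32SpecTest { // hypothetical name for a generated class

    @Test
    void i32_add_assertion_0() {
        // From the JSON: {"type":"assert_return","action":{"field":"add",
        //   "args":[{"type":"i32","value":"1"},{"type":"i32","value":"1"}]},
        //   "expected":[{"type":"i32","value":"2"}]}
        Instance instance = Module.builder("target/wast/i32.0.wasm").build().instantiate(); // path/API approximate
        assertEquals(2, instance.export("add").apply(Value.i32(1), Value.i32(1))[0].asInt());
    }
}
```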
Next, I really want to talk about the use cases for this technology, because they were really unexpected, even for us. I mentioned that I work for Red Hat; I sent an internal email to colleagues saying that I was starting to collaborate on this project, and that turned out to be great, because people from the JRuby team reached out and said, "Oh, we have a problem for you." I was like, "Are you sure?" It was really interesting, and it makes a lot of sense; I know you are tired, but bear with me. Ruby is an extremely difficult programming language to parse, because its syntax is super complicated: it enables you to do a lot of stuff, but it is super complicated to parse. So what the Ruby community is doing is reimplementing the Ruby parser in pure C, so that it will be the foundation and the official parser for all of the Ruby implementations from now on; that's why Prism was born. Then we have JRuby, which, as you might expect, is Ruby on the JVM. But Ruby is a very C-oriented community, so much of what sits behind the interface you see in Ruby is actually implemented in C, and you need to ship and execute native libraries to run that code. JRuby has always had this problem, which it shares with the rest of the Ruby implementations but really doesn't want to have, because one of the obvious advantages of running Ruby on the JVM should be that you don't have native dependencies; that brings us back to the initial point. They can't really bootstrap, because even the parser, just to parse a Ruby program, needs an external library, and that is a problem for them: they need to know your architecture, it's not portable, you have to cross-compile for different architectures, and if it's, say, a new ARM variant, maybe the library doesn't support it yet and you have to recompile it, and so on and so forth. Prism, though, can be compiled to Wasm; it already runs in JavaScript, for example, to show the AST of Ruby programs. What we did was take this Wasm compilation target for Prism and make it run inside Chicory. You just download a dependency which is super lightweight and super easy to import, and it's just plain Java bytecode that you can execute on any JVM, without downloading anything specific to the machine you are running on. Basically, what we are solving is the bootstrapping problem: people won't have to download any external dependency or any additional stuff to run a JRuby program on top of the JVM and bootstrap the environment. Are we slower? Yes, much slower, but it works, and it enables more use cases.

Now let's move to more interesting use cases, and I want to elaborate on this one. When I got hired at Red Hat I was part of the Keycloak team, and Keycloak is a very popular identity management product: it lets you do single sign-on and all those kinds of operations. What you should know is that it is spaghetti code all the way down, and it is made for integrating any kind of legacy system: you can have Active Directory on one side, LDAP, then Kerberos, and things like that. I know you want to rewrite things in Rust, but you don't want to rewrite that integration in Rust, I'm sure. And this is actually the power of Java: you can't easily get rid of those enterprise applications, so you want to empower them and put them to use in different ways. What you may not want to do is write a Java plugin every time you need to do something custom in Keycloak. In Keycloak, what they have is what you would expect: if you want, for example, to match and check the identity of a user against different providers, or to verify an identity with some custom rules for your business, what you have to do at the moment is, as I told you, write a Java plugin. But why? There is no real reason. Now, by just importing a JVM library (and it took me about two hours to put together a very initial proof of concept), you can write your plugins in any language that compiles to Wasm, like Rust, C, Go, JavaScript, or whatever. This makes those heavy enterprise applications easily extensible with a mechanism that is really lightweight for the project itself, because for them it just means importing a library and being able to load a file, load some Wasm code, and run it. This is super useful. There are a lot more use cases that we have explored with various levels of POC, but this extension mechanism, sketched below, is what lets you scale the extensibility and the complexity of your applications.
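Here is a deliberately simplified sketch of that extension mechanism. None of this is Keycloak's real SPI or Chicory's exact API; the file name, the `is_allowed` export, and the numeric-only signature are assumptions chosen to keep the example small. The idea is simply that the host loads tenant-supplied Wasm and calls an agreed-upon export instead of requiring a Java plugin.

```java
// Hypothetical host-side wrapper: load a user-supplied Wasm module and delegate a
// policy decision to it. Real plugins would also pass strings/structs through the
// module's linear memory; that part is omitted here.
import com.dylibso.chicory.runtime.ExportFunction;
import com.dylibso.chicory.runtime.Instance;
import com.dylibso.chicory.runtime.Module;
import com.dylibso.chicory.wasm.types.Value;

public class WasmPolicyPlugin {
    private final ExportFunction isAllowed;

    public WasmPolicyPlugin(String wasmPath) {
        Instance instance = Module.builder(wasmPath).build().instantiate(); // API approximate
        this.isAllowed = instance.export("is_allowed"); // export name is an assumption
    }

    /** The decision itself runs as Wasm, written in Rust, Go, JavaScript, ... */
    public boolean isAllowed(int riskScore) {
        return isAllowed.apply(Value.i32(riskScore))[0].asInt() != 0;
    }
}
```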
Apache Camel is a very popular integration framework; it gives you producers for producing and transforming data inside the integrations you are building. With Wasm this kind of customization becomes extremely easy and extremely useful, because normally what you do is, again, customize with Java; this is enterprise software that can be sold and supported by big companies, but now you can also plug in a Wasm module, so you can run Rust or C or Go or whatever to do your integrations without touching the Java part of it. Kafka, I know you have heard about it: with Kafka Connect you can perform transformations, and with Kroxylicious, which is a proxy for Kafka, you can write filters, so you ingest messages coming from a topic, transform them, and perform operations such as encoding, decoding, or encryption on top of them, or even routing; and you could do the same with a proxy over another very well-known Java project, a message queue that is used all around. A rough sketch of what such a Camel processor could look like follows.
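This is not an official Chicory/Camel integration: the guest ABI with `alloc`/`transform` exports, the in-place output convention, and the memory helper method names are all assumptions, while the Camel `Processor`/`Exchange` API is the standard one.

```java
// Hypothetical Camel Processor that delegates body transformation to a Wasm module.
// Assumed guest ABI: alloc(len) -> ptr, transform(ptr, len) -> newLen, output written in place.
import org.apache.camel.Exchange;
import org.apache.camel.Processor;

import com.dylibso.chicory.runtime.Instance;
import com.dylibso.chicory.runtime.Module;
import com.dylibso.chicory.wasm.types.Value;

public class WasmTransformProcessor implements Processor {
    private final Instance instance;

    public WasmTransformProcessor(String wasmPath) {
        this.instance = Module.builder(wasmPath).build().instantiate(); // API approximate
    }

    @Override
    public void process(Exchange exchange) {
        byte[] input = exchange.getMessage().getBody(byte[].class);
        int ptr = instance.export("alloc").apply(Value.i32(input.length))[0].asInt();
        instance.memory().write(ptr, input); // memory helper names are assumptions
        int newLen = instance.export("transform")
                .apply(Value.i32(ptr), Value.i32(input.length))[0].asInt();
        exchange.getMessage().setBody(instance.memory().readBytes(ptr, newLen)); // assumption
    }
}
```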
Last but not least, I want to talk about our roadmap. It has been just six months, the project is still really early, and it is still a lot of fun to join and work on. First of all, we want to make all of the test suite green with the interpreter. For now we have excluded SIMD support, because we don't have the capacity to tackle it yet and it would require Java 21 with the Vector API, so a few more complications down the road; but this is the first goal, and we are already mostly there: more than 90% of the tests are passing, so there are just a few bits left to fix. Next, we want to implement the validation logic. We cannot claim to have a secure runtime and a secure sandbox until we implement the validation part of the WebAssembly specification, and unfortunately, just to enable the first use cases and actually run something on top of Chicory, we have skipped that part so far; we are now starting to implement all the validation logic that will be throwing exceptions all around the codebase. And yes, we want WASI preview 1 support, because we want to be able to easily perform operations on files, for example, since that has been a very common use case for us so far. At the end of this journey we will be able to claim we have a runtime that can safely run your programs and conform to the specification. We also want to tackle some performance improvements; we are sure there is a lot of low-hanging fruit and a lot of other things we can do, even if it is just an interpreter, to improve performance over the current codebase. Last, and the most exciting thing we want to get to, is an AOT compiler. We started with an interpreter to have a solid base and a solid understanding of the WebAssembly specification; it has been a journey for us in learning and understanding each and every detail of the spec. As soon as the interpreter is done and delivered, we will be able to translate Wasm code directly into Java bytecode, and we do expect to make this fast, or at least that's our desired goal.

Just to show you that I'm not claiming something that isn't real, this is, for example, llama2.c running, which you have seen doing inference in a couple of talks before this one. This is running on pure CPU, so it is extremely slow; you can see that it takes time to get something out, but it is working, it's running in the background. It shows that real workloads can be run; we should probably improve on them, but there is something, and it is moving.

Very last but not least, I want to thank all the contributors, because we are lucky enough to have a lot of great colleagues and people contributing to this codebase, and I really want to encourage you to come and join this community of amazing people in this early project that we are enjoying a lot to develop and to bring to you. I think we have four minutes for questions.

Q: Hats off, it's the first reveal.js presentation that I've seen. Also, great t-shirt, and that's why I'm going to ask you this question: what's the overlap, and how do you compare the work you're doing to the GraalVM Wasm support they have? I don't know very much about it.

A: In GraalVM, what they built is basically a just-in-time compiler for any platform, with Truffle, which is an API that lets you write an interpreter for any language; in particular they have a Truffle-based interpreter for Wasm that runs on GraalVM. Just recently they made it available on any JVM, so it is actually portable now; that was not the case a few months ago, but now you can port it around, with some penalty on performance. So the goals are kind of similar, but you have to bring an entire just-in-time compiler with you, whereas we are aiming for simplicity: keeping the dependency as small as we can and making it light and easily portable to any JVM. Kind of the same goal, kind of the same way of executing, different trade-offs in the implementations, I would say.

Q: I think you answered my question. It's also going to be a lot of Truffle. [Laughter]

A: All right, well, thanks for sticking around; really appreciate it. [Applause]
Info
Channel: WASM I/O
Views: 739
Id: 00LYdZS0YlI
Length: 28min 23sec (1703 seconds)
Published: Tue Apr 16 2024