Releasing Faster with Kotlin Multiplatform

Captions
Good afternoon. So we're going to talk about some business problems that we had at Cash, which ultimately we solved with business solutions and sort of inadvertently led to developer productivity shifts. So to set some context, we have Android, iOS, and web apps. They're all developed separately and they're all sort of native to their platform, using the normal way that each of the respective companies want you to build apps. We do two week release trains for mobile apps. So every two weeks, there's a branch cut. Actually, we just switched to one week now, but every week or two there's an automatic branch cut. If you make it, you make it. If you don't, you go on the next one. And then there's a gradual rollout to customers over the next one or two weeks to ramp that up to 100%. And we have people that track things like monitoring new crashes or whatever. So that's sort of setting the stage for where the app is. Through this, we had a bunch of problems that sort of came up where we had to come up with creative solutions. So Cash deals with money transfer. This is me sending a payment to my friend, Alec. And in order to send money in the United States, you have to get a money transfer license, which is actually separate for every state. And so each state can actually have different requirements that you have to adhere to. So what's required in say California or New York might actually be different than what's required in Florida. And a state like Florida can come and say, well, you're sending this money to a person, that's fine. But we also support sending money to businesses and some of those businesses are nonprofits. And Florida can say, whenever that happens, you can't call it a payment. You have to call it a donation. And so also, by the way, you have seven days to comply with this or your money transfer license is revoked. And so seven days is actually not insurmountable. On the web, this is no problem. Merge a PR, do a deploy. This is out in a day. 
Android and iOS, because we build our apps natively, have a longer tail. It's anywhere from 3 to 7 days because it has to go through QA, and then we have to decide whether we roll that immediately to 100% of people. Do we go through a normal rollout phase? Do we do an accelerated rollout phase? What if we're in the middle of an existing rollout phase? How does that work? And so this is one of the problems. This is the kind of thing that can happen three times in a week and then nothing happens for three months. Or it could be consistently every week there's a different state. You have these 50 states, and while all of them are similar, they like to bring these things on you, and you have to sort of figure out a way to make that work. So that's one problem. Second problem, this is sort of common, where these product people just want to modify how a screen is displayed to test how that affects engagement or retention or whatever. So I want to put a new row in this profile screen or whatever, or I want to reorder documents to be higher up to see if that increases engagement or discovery of your tax stuff. So there's ways to do this. But that's problem number two. And then problem number three is, because we deal with money, there's a lot of ingestion of data. And it varies from user to user. So we might need your social security number, we might need your birthday, we might need your license, bank account info, you want to add credit cards, email address, phone numbers. We pull in a lot of data, and then we also have to show you a lot of things for legal reasons. We might show you sort of various terms, things about how money gets transferred. We might ask you things throughout the app. And so we have all these screens that are very structured and form-like. And there's hundreds of these throughout the app.
The app is very much like an iceberg, where 95% of people just see the tip of it, and then there's this huge part underneath, which is these hundreds and hundreds of screens for which each user might only see like one or two ever. So these are three separate problems that need three separate solutions. And these solutions are not actually the premise of the talk because we solved these many, many years ago. But we solved them all with different things. So for the case where we have to update display strings, we actually ship JavaScript engines in the Android and iOS app. And we do like a JSON in, JSON out sort of rendering step for a payment. And this allows us to update the logic behind rendering. But we don't affect the screen layout in any way. But it allows us to do these compliance things. And what's nice about this is that the time from when you merge a PR with this logic change to when the code is deployed on the CDN, where the apps can pick it up, is like eight minutes. And then it's just a feature flag flip away, a version bump on a feature flag, before we get that new stuff rolled out. So it's very different from 3 to 7 days plus native app rollout. We can turn this around in an hour if we need to. For something like doing A/B testing, modifying how a screen displays a couple items, we sort of just use a regular feature flag system. These effects are instant, but it means that the behavior that the feature flag controls has to be already in the app. So we have to know ahead of time that we're going to do the A/B test. But then once that's in, you get the immediate feedback. And then finally, for this like form-based case, where we have these hundreds and hundreds of screens, we use protocol buffers to represent sort of like a semantic set of controls. And then you can build up these screens declaratively on the server as data. And then that gets sent down to the client whenever they make a request.
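To make the forms-as-data idea concrete, here is a minimal Kotlin sketch. It is not Cash App's actual protocol buffer schema, which the talk does not show; every name in it (`Control`, `FormScreen`, `render`, `canRender`) is an assumption made up for illustration. It shows a screen described purely as data and a renderer that can only draw the control types it was compiled with.

```kotlin
// Hypothetical control vocabulary. The talk only says the real one is
// defined with protocol buffers; these types are illustrative.
sealed interface Control
data class Label(val text: String) : Control
data class PhoneInput(val hint: String) : Control
data class CardNumberInput(val hint: String) : Control

// Stands in for a control type this app version does not know about.
data class Unknown(val typeName: String) : Control

// A whole screen is just data that a server could assemble and send down.
data class FormScreen(val title: String, val controls: List<Control>)

// The renderer baked into one app version. Strings stand in for real views.
fun render(screen: FormScreen): List<String> =
    listOf("# ${screen.title}") + screen.controls.map { control ->
        when (control) {
            is Label -> "text: ${control.text}"
            is PhoneInput -> "phone field (${control.hint})"
            is CardNumberInput -> "card field (${control.hint})"
            is Unknown -> "unsupported: ${control.typeName}"
        }
    }

// A screen is only usable if every control in it is known to this version.
fun canRender(screen: FormScreen): Boolean =
    screen.controls.none { it is Unknown }

fun main() {
    val linkCard = FormScreen(
        title = "Link a card",
        controls = listOf(
            Label("Add a debit card to continue"),
            CardNumberInput("Card number"),
        ),
    )
    render(linkCard).forEach(::println)

    val verifyIdentity = FormScreen("Verify identity", listOf(Unknown("SsnInput")))
    println(canRender(verifyIdentity)) // prints "false"
}
```

The layout can change freely on the server side, but the set of controls and their behavior is fixed at the time the app is compiled.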
So I go to send a payment to our friend and the server says, hey, you actually have to fulfill this requirement first. We call them blockers because they block the next action. And there could be multiple of them. And it's these forms that are dynamic that accept all kinds of input or show you a confirmation screen or whatever. And so all of that is dynamic, but the controls that we actually can show you have to be baked into the app. So if I want to show you something like enter your social security number, and we've never done that before, that's a new thing that we have to add to the app, roll it out through the normal mechanism, and then the server can start asking for it. And it will show like the separation of digits for a social security number. But if it's a type that we already know, like a phone number input or whatever, credit card number, and we want to ask you to link that, then we can show that across all of our app versions. And so these are three separate solutions for three separate problems that manifested sort of organically over the course of a decade of Cash App. And each of them has advantages and disadvantages. With JavaScript, we can change the logic, but not the rendering. With feature flags, we can sort of change the rendering, but the options have to be baked into the native binary. And then with the protocol buffers, we can change the rendering, but we can't actually change the logic; that has to be baked in. And so we had this goal of unification of these problems. There's a lot of sort of productivity advantages to each of these solutions because we've changed that multi-week waiting period into anything as low as 10 to 15 minutes for the JavaScript solution or instantly for the form solution. But if your needs outgrow the choice that you made as to which technology to choose, you're stuck. If you have the form and you need dynamic logic, you can't do that.
If you have the Dynamic Logic, but you need to tweak the rendering a little bit, you can't do that. And so you're sort of stuck with the choice you made, even though you got all these nice advantages. Eventually, you outgrow them. So we had this project to unify, how all these are done and sort of come up with a system where we get all the positives of each one and lose all the negatives. So we want logic to be updated outside of App Store releases. It simply just takes too long to release this stuff. And with the type of app that we have, the requirements are that we need to have fast turnaround time. This is not for 100% of screens. It's sort of ones that are focused more on the... Like, compliance and sort of data presentation and not these like real richly interactive stuff. And we don't want to come up with a... We don't want to produce a split brain problem where half the app is rendered one way and half the app is rendered a different way. So you'll see this with some frameworks where they take over sort of rendering of these screens that are done in a cross platform way. And then you have sort of the native screens. And if somebody goes and changes the corner radius of a button, that doesn't reflect in the half of the app that took over rendering. So we want a single source of truth for all of these components. Buttons that are put on screen natively and buttons that are put on screen through the dynamic system need to have a shared source of truth, needs to be the same native UI element. And also, we want to avoid sort of the uncanny valley of these frameworks that try and reimplement a lot of things like the iOS over scroll physics or the ripple on Android when you touch the buttons. Like, we just want it to be the native UI because native UI is what we think is best for our customers. And we just don't want to alienate our existing developers on iOS, Android and the web. 
We want them to still feel like they have ownership over how these things are created, how they behave. And also, we don't want them to be the same on every single platform. The fact that an iOS button might look and behave a little bit differently than an Android button is a choice that the design team might make and we don't want to get in the way of that. So basically we just want to say like it's not our problem to figure out how to render things. That's still the native UI. And so as part of this, we also want to create screens that are entirely unknown by the app when we ship it to the app store. So this is similar to the case where those forms can be unknown to the app, but since they're just data that ends up rendering on screen, that's much simpler. This is going to have logic as well that needs to come down. And then you have to tie into sort of the navigation system of each app. And then I sort of touched on this, but we don't want to like regress any of the native UI development that... The native screens that still exist in the app. We don't want to have our Android or iOS developers be like cursing our names behind their back because we're causing them to use like an older UI toolkit or not able to use sort of the modern conveniences of their IDE or libraries or whatever. So this is a lot. And this is a cross platform thing. So we need to choose a language to write in. Before for our language choice, we were just doing JavaScript. We were writing JavaScript, shipping JavaScript, but because it was so tightly scoped, it's like we put up with it. But for something that is going to encompass much more of the app, we need to pick something else. So we looked at a lot of languages and we actually chose Kotlin objectively, which is... I'm an Android developer, I love Kotlin, but when we were evaluating it, it actually became the right choice, objectively, when we started listing pros and cons of other choices. 
There's a few reasons for this: a fantastic IDE, existing knowledge of our Android and server teams. But the most important aspect is actually a somewhat unique way that Kotlin compiles to different platforms. So there's actually multiple compilers for Kotlin that target different backends. And so you take Kotlin that's so-called common, and then you can compile it with either the JVM compiler, a native compiler, or a JavaScript compiler, and then use those each on their respective platforms. So the Kotlin/JVM compiler produces Java class files, which is the same thing that the Java compiler produces. And so it feels and behaves very native on an Android platform. Take that same code, compile it with Kotlin/Native, and it produces something called LLVM IR. Well, you know what else produces LLVM IR? The Swift compiler. So they both bottom out in the same sort of thing, and then that gets compiled to native code and runs natively on iOS. And for JavaScript, you compile your Kotlin code with the Kotlin/JS compiler, and it actually produces JavaScript that obviously runs natively in a JavaScript VM. The same code gets compiled in multiple different ways, which is in stark contrast to other languages we looked at, like Rust, Swift, TypeScript, or whatever, which I think are superior from a language design perspective, but are inferior from the compilation mechanism, because they basically all turn into either native code, which then is much harder to interact with from Android and on the web, or in a JavaScript VM, or TypeScript just turns into JavaScript. And so we're still stuck, basically having to write TypeScript. So we chose Kotlin. We were already using a JavaScript VM, but because of the way Kotlin works, we decided that we could do a little bit better, because one of the problems with JavaScript is that it's untyped. So it's essentially strings in, strings out. We send it a JSON blob, we get a JSON blob out.
The keys in that JSON blob can change inadvertently on either side, the fields that are required. It's extremely hard to keep track of the requirements on both sides. And so we had all kinds of problems that you only discover in the wild when you ship to older versions of the app. So we built this library called Zipline, which intends to completely abstract away the JavaScript virtual machine. And we abstracted it away so far that actually we plan to completely replace it with a WebAssembly virtual machine eventually. And the idea is that if you're writing Kotlin and targeting Kotlin/JS for running in the JavaScript engine and Kotlin/JVM for running on Android and Kotlin/Native for running on iOS, then you never should actually have to know that there is this virtual machine underneath. And so if we have just sort of a normal class, this class that does that payment rendering I talked about earlier where it decides what string gets displayed, what we do is we split that up into an interface and an implementation. The goal of which being that you program against the interface and then we actually take that implementation and we move it into the JavaScript engine transparently. You never see it. So you're just programming against an interface like it's implemented by a normal class running within the Android app or the iOS app sort of natively. But actually, every function call you make, we're transparently passing that across the JavaScript bridge and calling it on the associated JavaScript object. So this implementation gets exposed out of the bridge through this. And then on sort of the client side, you can take it and you just get an instance of that interface, and you don't even know that the implementation is running inside a JavaScript VM. But we get all the type safety that we didn't have before when talking to the JavaScript engine directly.
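The interface and implementation split can be sketched in plain Kotlin. This is not Zipline's API: `Bridge`, `expose`, `take`, and the toy wire format are all hypothetical stand-ins, and a real setup would run the implementation inside a JavaScript engine rather than in the same process. The sketch only shows the shape of the idea: callers hold a typed interface, while strings are the only thing that crosses the boundary.

```kotlin
// Types shared by both sides of the bridge (illustrative, not from the talk).
data class Payment(
    val amountCents: Long,
    val recipient: String,
    val recipientIsNonprofit: Boolean,
)

// Callers program against this interface only.
interface PaymentPresenter {
    fun title(payment: Payment): String
}

// The updatable side. In the setup the talk describes, this part would be
// compiled to JavaScript and could be replaced without an app release.
class RealPaymentPresenter : PaymentPresenter {
    override fun title(payment: Payment): String {
        val noun = if (payment.recipientIsNonprofit) "Donation" else "Payment"
        return "$noun to ${payment.recipient}"
    }
}

// Hypothetical stand-in for the engine boundary: only strings cross it.
class Bridge {
    private val services = mutableMapOf<String, (String) -> String>()

    fun bind(name: String, handler: (String) -> String) {
        services[name] = handler
    }

    fun call(name: String, encodedArgs: String): String =
        services[name]?.invoke(encodedArgs) ?: error("No service bound as '$name'")
}

// Toy wire format. The recipient goes last so it may contain the separator.
fun encode(p: Payment): String = "${p.amountCents}|${p.recipientIsNonprofit}|${p.recipient}"

fun decode(s: String): Payment {
    val parts = s.split("|", limit = 3)
    return Payment(parts[0].toLong(), parts[2], parts[1].toBoolean())
}

// Implementation side: decode the arguments and call the real object.
fun expose(bridge: Bridge, name: String, impl: PaymentPresenter) =
    bridge.bind(name) { args -> impl.title(decode(args)) }

// Caller side: a typed proxy that hides the string passing.
fun take(bridge: Bridge, name: String): PaymentPresenter = object : PaymentPresenter {
    override fun title(payment: Payment): String = bridge.call(name, encode(payment))
}

fun main() {
    val bridge = Bridge()
    expose(bridge, "paymentPresenter", RealPaymentPresenter())

    val presenter: PaymentPresenter = take(bridge, "paymentPresenter")
    println(presenter.title(Payment(500, "Alec", false)))     // prints "Payment to Alec"
    println(presenter.title(Payment(2500, "Food Bank", true))) // prints "Donation to Food Bank"
}
```

Because the caller only holds the proxy, binding a different implementation under the same name changes the behavior without the calling code changing at all.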
And so now we can go and land a PR, build the JS artifact, ship that to the CDN, we can throw away the existing implementation while the app is running and load a new implementation from the CDN and get that brand new fancy logic without the calling code ever even knowing what's going on. And because the code on the bottom and like the code that has the interfaces is also Kotlin code, we can share that between Android and iOS. So even that code, the code that interacts with the shared interface gets to be shared across Android and iOS and then the implementation runs transparently within the JavaScript VM. So that's all nice for just data, right? Like I'm sending in this payment object and getting out strings. But one of our goals here is to... And that allows us to do things like change logic, but our goal here is to actually render UI, render screens. So how do we do that? We can't just sort of transparently send over model objects that represent sort of like UI elements, because then, how do we do interactivity? It's very static, and it becomes a challenge. And so separately, so Zipline is sort of a fundamental thing that we built just around having Kotlin run in two places at once transparently. Separately, we built a tool called Redwood, which is basically this abstraction on a design system. And I'm going to talk about what that is. So we're doing a programmatic representation of the components in your design system so that code that is shared across multiple platforms can interact with it in a way that's common. So here's an example of just like the most basic representation of something simple, which is I have the ability to show a column, and I have the ability to show a text input. And within these things... And so we write this in Kotlin for various reasons, but these types don't actually get used. They're sort of just like a schema. And so you can have properties, so like the text property of a button or the hint property of an input field. 
You can make containers, so the column obviously needs to be able to hold a set of children. You can represent one or more sets of children on a widget, which is what we call these. And then there's events. So the user clicks a button. The user presses or touches and holds on something, or the input field actually changes its input. So this is a very high-level representation at sort of a design system level of like what components can be put on a screen. And it's very subjective. So like if we want to build a representation of something like this, you can make this really coarse-grained representation, where you have rows and columns and images and text, and then in the code you'll figure out how to arrange those. You'll make a row with the image and then a column with the two text fields or whatever. And that's fine. That gives you a lot of flexibility. But the reason we call it a design system schema is we really want you to have these high-level components that are much more semantic to the controls that are being shown in your app. And so we have this contact item that has the specific fields with really nice types, like a URL for the image. And then for texts, we have things like maybe a font family, if you have one or two fonts, and a font style so that you can have title, subtitle, text, whatever, instead of just offering maybe like arbitrary text size or whatever. So which of these do you choose? Redwood, the library we built, doesn't actually care. It's up to you to decide with your design team what the level of abstraction is for these. And then from that, we generate two things. We generate something called composables. They look very similar to the schema that you defined. And these are the functions through which you build the screens in the logic. And so you don't really need to know what they do, but the way it ends up looking is something like this.
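The slide itself is not captured in the captions. As a rough stand-in, here is a plain-Kotlin sketch of what a semantic schema and the functions for building a screen from it could look like. It is not Redwood's generated code and does not use the Compose runtime; the node types, the `Children` scope, and the `screen` function are all assumptions invented for this example.

```kotlin
// Schema-level description of what can go on screen: properties, events,
// and children. Names are illustrative, not from the talk's slides.
sealed interface Node
data class TextNode(val text: String) : Node
data class TextInputNode(val hint: String, val onChange: (String) -> Unit) : Node
data class ContactItemNode(val name: String, val imageUrl: String, val verified: Boolean) : Node
data class ColumnNode(val children: List<Node>) : Node

// Receiver scope that offers one function per widget in the schema.
class Children {
    val nodes = mutableListOf<Node>()

    fun text(text: String) {
        nodes += TextNode(text)
    }

    fun textInput(hint: String, onChange: (String) -> Unit) {
        nodes += TextInputNode(hint, onChange)
    }

    fun contactItem(name: String, imageUrl: String, verified: Boolean) {
        nodes += ContactItemNode(name, imageUrl, verified)
    }

    // A container widget: it collects its own set of children.
    fun column(content: Children.() -> Unit) {
        nodes += ColumnNode(Children().apply(content).nodes)
    }
}

// Runs the builder block and returns the resulting top-level nodes.
fun screen(content: Children.() -> Unit): List<Node> = Children().apply(content).nodes

fun main() {
    var query = ""
    val tree = screen {
        column {
            textInput(hint = "Search contacts") { query = it }
            contactItem(name = "Guy", imageUrl = "https://example.com/guy.png", verified = true)
        }
    }

    // Simulate the user typing into the input field.
    val column = tree.single() as ColumnNode
    (column.children[0] as TextInputNode).onChange("tree")
    println(query) // prints "tree"
}
```

The point of the semantic `contactItem` is that the screen code states what it wants to show, and the arrangement of image and text is left to whoever implements that widget on each platform.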
And if you're an Android developer, it probably looks fairly similar to something like Compose UI. But it's actually not. These functions are your design system, and you get all the sort of type safety and whatever. And this is the thing that has to be learned. It's a way of representing your state and the structure of your UI. And what's gonna happen is as this code executes, which we'll see in a second, it's gonna interact with the native app to coordinate how native UI gets put on the screen. And the way it does that is we generate these things called widget bindings, which also look exactly like the schema that you defined and have the same components. And what these do is they are the actual abstraction of the native UI toolkit. So we have this widget interface, which holds the Android View, the iOS UIKit UIView, the DOM element when it's running on the web or whatever. And then from that we take an interface, say like for the column that has a set of children, and you just wrap whatever the native version of that is. So on Android, for its view system, that'd be like a LinearLayout. On iOS, that would be like a UIStackView. On the web, it'd probably just be a div. And so when these things are combined, what ends up happening is Compose executes your function, it runs like linearly down. And as it's executing, it's creating these node objects. And so it runs through your function, it goes to its children, it starts adding these other things, and we essentially build up this tree of nodes. And that's nice and all, but that doesn't give us the actual rendering. And so what we do to turn that into a normal platform native UI is that as that tree gets built, each node gets associated with some corresponding platform native component. So I showed you like the column gets mapped to a LinearLayout or whatever, rows and columns. And so as the tree gets built, we're simultaneously building like this associated tree in the native UI.
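The mapping from a shared tree to a native tree can be sketched as follows. This is an illustration under stated assumptions, not Redwood's generated widget bindings: `WidgetFactory`, `FakeAndroidFactory`, `HtmlFactory`, and `build` are hypothetical names, and strings stand in for real Android views and DOM elements so the example runs anywhere.

```kotlin
// The shared, platform-neutral tree.
sealed interface Node
data class Text(val text: String) : Node
data class Button(val label: String) : Node
data class Column(val children: List<Node>) : Node

// One factory per UI toolkit. W is that toolkit's base widget type.
interface WidgetFactory<W> {
    fun text(text: String): W
    fun button(label: String): W
    fun column(children: List<W>): W
}

// Stand-in for an Android binding: strings describe the views it would create.
object FakeAndroidFactory : WidgetFactory<String> {
    override fun text(text: String) = "TextView($text)"
    override fun button(label: String) = "Button($label)"
    override fun column(children: List<String>) =
        "LinearLayout[vertical](${children.joinToString()})"
}

// Stand-in for a web binding that emits markup instead of touching a DOM.
object HtmlFactory : WidgetFactory<String> {
    override fun text(text: String) = "<span>$text</span>"
    override fun button(label: String) = "<button>$label</button>"
    override fun column(children: List<String>) =
        "<div style=\"display:flex;flex-direction:column\">${children.joinToString("")}</div>"
}

// Walks the shared tree once and builds the matching native tree.
fun <W> build(node: Node, factory: WidgetFactory<W>): W = when (node) {
    is Text -> factory.text(node.text)
    is Button -> factory.button(node.label)
    is Column -> factory.column(node.children.map { build(it, factory) })
}

fun main() {
    val counter = Column(listOf(Button("-1"), Text("0"), Button("+1")))
    println(build(counter, FakeAndroidFactory))
    // prints "LinearLayout[vertical](Button(-1), TextView(0), Button(+1))"
    println(build(counter, HtmlFactory))
}
```

The shared tree never names a toolkit, so supporting another one means writing one more factory rather than touching the screen logic.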
If you're a web developer, it's sort of similar to how like a virtual DOM gets turned into an actual DOM. And so once we have this trick, we just get to recreate it on every single platform. Well, we know how to build the tree on the left. So on iOS, we have these very simple mappings to the native tree, the native types in the UI system, on the right. Same thing for web. As the tree's getting built, we just build the associated DOM nodes. And now you wind up with what somebody natively would've built, but through a common representation from those Compose functions. So I know that's a lot. There's a lot that goes into this, but it's hard to skip over that and get to what it ends up looking like. So I think it's important to see that even if it's not fully understood, there's certainly a lot more detail to go into there. But I got three, maybe four real-world examples where we'll build up to looking at how we actually use it in Cash App. This is like the ultimate simplest example. It's just a counter that increments or decrements. And you can kind of see what's going on here. We have the widgets that are button and text, and they have the click listeners that increment or decrement a count and display it. And then the column that can hold sets of children. And then this is how Compose does state management, which you may not have seen. Maybe you've seen like React or whatever, which looks very similar, where there's sort of like these state containers that you can then read and write to. And so this gets compiled, if we're just doing vanilla Redwood, this gets compiled to all the platforms. And what we can do is we make those simple bindings. So for the text, we bind it to Android's TextView, and then there's like some boilerplate where we do one-time setup on Android and then display our composable. All of that to say is what we wind up with is...
This is a screenshot of Android Studio, this counter app running. And I don't know if it's kind of hard to see, but these are native views that are being put on screen, and then I can interact with this and increment or decrement the counter. Android has a new UI toolkit called Compose UI. And like I said, we don't wanna hold our developers back, so if we wanna bind these interfaces to Compose UI, we can do that. Get a little different setup for Compose UI. And now the app is running with Compose UI, where all I had to do was implement a couple interfaces one time, five line setup or whatever. And now that shared logic, that shared counter logic of incrementing and decrementing a button is exactly the same, hasn't changed, but it's now displaying in an entirely different UI toolkit. The web, same thing. Here's a screenshot. You can see there's actual button, and we use a span for text elements. Column is a div, like I said. It's using Flexbox there to turn it into a column. And then iOS, I only have an example with UIKit, but SwiftUI also actually works. So similar screenshot of the Android one where you can see the iOS, UIKit view hierarchy. All of that to say, Redwood is sort of the technology of this abstraction of a design system. Zipline is the ability to take Kotlin code and transparently run it inside a JavaScript VM to have it be updated. And the combination of those two is this thing we call Treehouse. And so wouldn't it be nice if that counter sample, I could update the logic without having to recompile the app, which is the thing that we set out to do in our goals. And so half of this boilerplate code, this one time code you write is dealing with setting up the Native UI toolkit binding, and then the other half is dealing with setting up the shared logic, that compose code. 
And so what we want is to take the half that deals with just the presentation logic, the Compose code, and run that inside the JavaScript VM, and then the half that deals with the native platform to actually be compiled to the native platform, Android in this case, because that means now we have the ability to update an entire screen's worth of logic. Not only the logic, but actually the nodes that get presented on the screen. So if I wanted to change the column to a row, and have it be rendered horizontally, I now can do so without the native app ever having to get recompiled. So this is sort of the pipeline of how they communicate. And what I can do is I can throw away that entire top layer now, download a new thing from the CDN and bootstrap it without the app ever having been restarted. And there's a lot that ends up going into each of these boxes, but the idea behind Treehouse is that we wrap up all the ceremony on the right. And so all you, the person building a screen, have to worry about is writing that presenter, the actual logic and the structure of the UI that's gonna go on the screen, and then writing the bindings to whatever native UI toolkit you want. Okay. This is the next demo. This is actually a video where I'm gonna show you what the developer experience of this is like locally. So this is one of the samples. It's running on Android, iOS, and web, and it's an emoji filtering app. So I'm gonna type in tree for Treehouse and see the emojis that have a tree, but I don't know what those emojis actually mean. And so I'm gonna go and uncomment this text label so that I can see what the emoji name actually is, and I'm gonna hit save. And there's a terminal there running. And in about four seconds, it recompiles the Kotlin code to JavaScript and then it deploys the JavaScript onto the running app. Because this is in development, it actually just live reloads.
So you can see Android and iOS live reloaded to have the label. And then the web is the web. So I guess we still could have the live reload, but we don't, we just tell you to refresh. And so as I'm building this, right now, I'm only modifying, I guess. Well, adding the text is changing the elements that are being displayed. But I also can change the logic. So like I just duplicated the label twice. And again, you don't actually have to hit save, it saves automatically. And then four seconds later, you see the result in the UI. But that's just a sample. This is our actual Cash App. This is one of the two screens that are currently implemented in this technology called Treehouse. And what we're gonna do is show this in practice. So like the text rows are not very visible, so we're gonna make them bold. There's an icon missing. So this person, Guy, is like verified. He should have a check mark. And then we'll put an icon next to each person. So like, Nike has the Nike logo, and Guy has his face, or actually doesn't have his face, but has a letter. And so this is the real IDE. We're editing the app or whatever, and then we hit save. And then again, there's a terminal down at the bottom. So this is the real app. It's a little bit slower. It's a whole eight seconds that we have to wait, twice as long. But like, if you ask an Android developer doing a native screen how long it takes them to do something like this when they hit deploy, if they're lucky, it's probably a minute or two. If they're unlucky, it's 10 to 30 minutes, depending on how big the app is. So here, I'm uncommenting like an if statement that hid the avatar. Hit save. This one's actually a little bit faster. It's only about six seconds. And then you see that now we have an avatar next to each item. So these are the native screens in the app. They look and feel native. You would never know that it's running in a JavaScript VM behind the scenes.
And this is just another example of a screen written entirely in Treehouse. And if you look, you can hopefully see what the components that we designed are. There's probably like a tile component, and that probably has a title and maybe like an enum or a variant for what is displayed within the tile. And so we can do things like reorder these and all that stuff without having to update the app. So the last thing I wanna talk about is this idea that we embed the JavaScript, but then we have the ability to compile this on our CI, deploy it to the CDN, and then update the apps. I talked about this at the beginning. When we were doing just JavaScript editing, it was about eight minutes. We actually haven't regressed on that at all. It's still eight minutes. So if I want to go and make one of these changes in our app today, if I absolutely wanna roll it out as fast as humanly possible, I can get the PR merged, and eight minutes later, I can flip the feature flag and it'll be seen by our customers. So both of these things are open source, but they're more like brick and mortar rather than a house. You have to build a ton yourself. You bring your own design system, you gotta define that whole schema, which takes a long time. You gotta build all the infrastructure for deploying the JavaScript and compiling it on CI and doing the feature flag stuff. This is not a turnkey solution for everyone, but I think the results, at least for us, in comparison to what native app development, native screen development has been like, are really unprecedented. It's allowed our developers to have this insane local dev turnaround time but also to get things into users' hands in sort of an unprecedented amount of time. And that's all I got. Thank you very much.
Info
Channel: Gradle
Views: 10,762
Id: aSvidgk4vgc
Length: 32min 24sec (1944 seconds)
Published: Thu Mar 07 2024