Monorepos - How the Pros Scale Huge Software Projects // Turborepo vs Nx

Captions
A software company like Google maintains a lot of code. Like, seriously a lot: over 2 billion lines of code and nearly 100 terabytes of data to go along with it. It's a scale that's hard to even comprehend, like all the stars in the universe or all the universes in the multiverse. And you have thousands of engineers around the world working on it simultaneously. But get this: they store all their code in a single monolithic repository, and they've been doing it since the very beginning. Today, Google's monorepo is likely the largest codebase in the world, but it takes an extraordinary effort to scale. They have their own homegrown version control system and a highly advanced build tool called Bazel, which goes by the much cooler name of Blaze internally.

In today's video, you'll learn everything you ever wanted to know about monorepos, and how you, the humble JavaScript developer, can build a high-performance monorepo in your garage. If you're new here, like and subscribe. We're so close to 1 million, and if you hit the button right now, we might even make it there by Christmas.

You may have heard some exciting news last week: Vercel acquired a company called Turborepo. It's a build tool written in Go that makes it really easy to manage multiple apps and packages in a single Git repository. But first, let's answer the question: why would I want to use a monorepo? There are many reasons, but at the highest level it gives you visibility of your company's entire codebase without the need to track down and clone a bunch of different repos. In addition, it provides consistency, because you can share things like your ESLint config, a UI library of web components for your design system, utility libraries, documentation, and so on.

The real power, though, comes in the form of dependency management. Imagine somebody makes a breaking change to a shared library: all affected applications will know instantly, and monorepo tools can actually help you visualize the entire dependency graph of your
software. When it comes to third-party dependencies, a monorepo can dedupe packages that are used in multiple apps. A monorepo is also ideal for continuous integration and automation, because your code is already unified by default, making it much easier to build and test everything together.

But there is one big problem with monorepos, and that's the fact that they're big. As the monorepo becomes larger, there are more things to test, more things to build, and a lot more artifacts to store. As a result, VS Code will lag trying to process the massive Git history, and you'll need a 20-minute smoke break waiting for everything to run on the CI server after every commit. To operate a monorepo at scale, it's absolutely essential to have the right tooling. That's why Facebook created Buck, Microsoft created Rush, and Google created Bazel. You just need a PhD in order to use it.

Luckily, there are other options out there. The most basic approach is to use your package manager, like Yarn or npm, to define workspaces. These tools basically configure your project with a root-level package.json, which then has nested workspaces like apps and packages that are linked back to the root-level project. A cool thing about this is that it will dedupe your node_modules, which means if you have the same package installed in multiple apps, it will only be installed once. It also allows you to orchestrate scripts, like if you want to build or test all of your apps at the same time.

That's a good, easy place to start, but if you're building an open source project that publishes a bunch of different packages, then you'll likely want to look into a tool called Lerna, which can optimize the workflow of a multi-package repo. Here's an example: Turf.js is a geospatial library that has a ton of different packages that are essentially helper functions for working with geolocation data. Each one of them can be installed as its own package and lives in its own subdirectory here in the repo. Lerna is the tool that it
uses to help manage this workflow efficiently. Most importantly, it allows you to publish all of your packages to npm with a single command.

These tools are great at configuring monorepos, but they still suffer from the same problem I mentioned earlier: they become really slow and difficult to work with as they grow larger. One problem is the installation of dependencies. If you're looking to improve your install speed, an easy optimization to make is to replace npm with pnpm. It's a drop-in replacement that will install your dependencies globally and symlink them, and that can make your install speeds up to three times faster.

That's a nice upgrade, but what really makes a monorepo slow is the constant need to recompile, rebuild, and retest everything. And that brings us to the fun part of the video, where we talk about an entirely different class of tools that can make your monorepo operate at the speed of Google. The tools I'll be comparing are Nx and Turborepo. They both operate as what I would describe as a smart build system. What I mean by that is they create a dependency tree between all of your apps and packages, which allows the tooling to understand what needs to be tested and what needs to be rebuilt whenever there's a change to the codebase. They cache any files or artifacts that have already been built, and can also run jobs in parallel to execute everything much faster.

We'll look at some example code in just a second, but first, what is the difference between Nx and Turborepo? Nx has been around for about five years and was created by two ex-Googlers. Turborepo, on the other hand, was just open sourced the other day. It was created by Jared Palmer, who you might know from the React ecosystem with packages like Formik. At this point, Turborepo is a lot more minimal than Nx. Nx can do everything that Turborepo does and has a lot of features beyond it, like a CLI that can automatically generate boilerplate code for you, a plugin ecosystem, a VS Code extension, and something called
distributed task execution, which allows you to distribute work across multiple continuous integration servers. It's a pretty amazing feature from a technical standpoint and was inspired by Bazel.

On one hand, it's great to have all these features, but on the other hand, it might be a little more than you actually need. Nx has been criticized for requiring too much configuration, although I don't necessarily agree with that: if you're only using the core Nx features, that configuration is actually very minimal. That being said, Turborepo feels like a tool that goes out of its way to get out of your way. For example, if you're already using Yarn or npm workspaces, all you have to do is add a few lines of configuration to your package.json and you now have a magically super fast monorepo.

When it comes to dependencies, both tools visualize the dependency graph, although Nx does a much better job of this. But again, it's had a lot more time to bake in the oven. Another cool thing that both of them do is remote caching. With Turborepo, when it caches something, it can also cache those files on Vercel. That means if Bob builds the application, then Alice checks it out on a different computer, the entire cache can be downloaded remotely instead of wastefully rebuilding and recomputing everything in the monorepo, and that can save a huge amount of time for a large organization.

The final comparison point I want to make is that Turborepo is written in Go, while Nx is written in TypeScript. In theory, this could result in faster startup times when running Turborepo, but the underlying language is not where the performance gains come from. It depends on when the computations take place and how they're cached. But I'd be interested to see some benchmarks between these two tools.

Now it's time for you to build your first monorepo. I already have a video on Nx if you want to check that out, but today we're going to take a first look at Turborepo. You can add it to an existing monorepo, or they also have a
template to start from scratch. Run npx create-turbo from the command line, and that should bring up a menu with some gradient text that says Turborepo. It'll give you the option for a package manager, and I'll go with Yarn as the default.

When we open up the project in VS Code, you'll see that we have two root-level directories for apps and packages, where apps are the actual applications you deploy, like a Next.js or React app for example, and packages are the different configurations, utilities, and UI libraries that those apps depend on. Now if we open up the root-level package.json, you'll notice we have workspaces set up for apps and packages there as well. Then towards the bottom, you'll notice the turbo configuration. This is where you configure a pipeline to run tasks efficiently.

Normally in a monorepo, everything runs one by one: you do all your linting first, then you do all your builds, then your testing, and finally your deployments. With a pipeline, you can explicitly tell Turbo what a task depends on. Like, you can run lint whenever, but to run your build you'll want to make sure that all the dependencies have been built first; the caret symbol refers to dependencies. Then to test the application, we'll want to make sure that the build has completed, and to deploy, we'll ensure that build, test, and lint have all been completed. A pipeline like this allows you to condense the timeline and utilize more CPU resources to deliver faster builds.

And that's pretty much all it takes to get started. We can now build or develop all of our apps in parallel. If we run yarn run dev, it serves both of the Next.js applications in this repo at the same time. In the terminal, you can see where it's using the cache. In the browser, you get two apps running at the same time without having to deal with multiple processes.

An interesting thing to notice here is that if we go into the package.json in one of the applications, you'll notice that it has dependencies like ui that are linked to a star character, and
the same for config and tsconfig. These are dependencies that live in the monorepo under the packages directory. What's cool about this is that if we go into our ui package and make a change, the change will be reflected instantly in both applications at the same time. There's no need to recompile and reinstall a dependency; everything is continuously integrated into the entire codebase.

And the awesome thing about Turbo is that everything becomes much faster by default. Like, if we run the build command, the first time around might take a couple of minutes, but when we run it a second time, everything will be cached and that run might only take a few hundred milliseconds. At least in theory: there's a bug on Windows right now that's ruining my video, but normally this build would be done in a matter of milliseconds. I blame myself for not using Linux.

To summarize, monorepos can be an awesome tool when you have a large, complex project. The great thing about tools like Turbo and Nx is that they dramatically lower the barrier of entry to scaling a massive codebase. A monorepo might be overkill for your blog, but if you're the technical co-founder of a company, it's definitely an approach to building software that you want to be aware of.

I'm going to go ahead and wrap things up there. If you want access to more advanced JavaScript development content, consider becoming a pro member at fireship.io. Thanks for watching, and I will see you in the next one.
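
The pipeline idea from the transcript (lint can run whenever, test waits for build, deploy waits for build, test, and lint) can be sketched in a few lines of TypeScript. This is only an illustration of the scheduling idea, not Turborepo's implementation: the `Pipeline` type, the `planWaves` function, and the single-package simplification (the caret-prefixed dependency on other packages' builds is left out) are all assumptions made for the example.

```typescript
// Hypothetical model of the pipeline described in the video.
// Task names come from the transcript; the types and function are illustrative.
type Pipeline = Record<string, { dependsOn: string[] }>;

const pipeline: Pipeline = {
  lint: { dependsOn: [] },   // "you can run lint whenever"
  build: { dependsOn: [] },  // the "^build" dependency on other packages is omitted here
  test: { dependsOn: ["build"] },
  deploy: { dependsOn: ["build", "test", "lint"] },
};

// Group tasks into "waves". Every task in a wave has all of its
// dependencies completed by earlier waves, so the tasks within one
// wave could run in parallel.
function planWaves(p: Pipeline): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = Object.keys(p);

  while (remaining.length > 0) {
    const ready = remaining.filter((task) =>
      p[task].dependsOn.every((dep) => done.has(dep))
    );
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency in pipeline");
    }
    ready.forEach((task) => done.add(task));
    waves.push(ready);
    remaining = remaining.filter((task) => !done.has(task));
  }
  return waves;
}

console.log(planWaves(pipeline));
// [ [ 'lint', 'build' ], [ 'test' ], [ 'deploy' ] ]
```

Running the four tasks one by one takes four steps; here lint and build share the first step, which is the "condensed timeline" the video refers to. A real scheduler would likely start each task as soon as its own dependencies finish rather than waiting for a whole wave.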
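
The "second build is nearly instant" behavior comes from caching task outputs against their inputs. The sketch below shows that idea in TypeScript under loud assumptions: `fingerprint`, `cachedBuild`, and the in-memory `Map` are hypothetical names for illustration, and the cache key is the serialized inputs themselves, where a real tool would more plausibly use a content hash and store artifacts on disk or remotely.

```typescript
// File path -> file contents for one package (illustrative stand-in for real files).
type Sources = Record<string, string>;

// Build a deterministic key from the task name and its inputs.
// Paths are sorted so the key does not depend on object key order.
// Assumption: a real tool would hash this string instead of using it directly.
function fingerprint(task: string, sources: Sources): string {
  const files = Object.keys(sources)
    .sort()
    .map((path) => JSON.stringify([path, sources[path]]));
  return task + "\n" + files.join("\n");
}

const cache = new Map<string, string>(); // fingerprint -> build output
let buildsExecuted = 0;

function cachedBuild(
  task: string,
  sources: Sources,
  build: (s: Sources) => string
): string {
  const key = fingerprint(task, sources);
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // cache hit: skip the work entirely

  buildsExecuted++;
  const output = build(sources);
  cache.set(key, output);
  return output;
}

// A toy "build" that just concatenates file contents.
const bundle = (s: Sources): string => Object.values(s).join(";");

const ui: Sources = { "Button.tsx": "export const Button = 1" };
cachedBuild("build", ui, bundle);                          // miss: runs the build
cachedBuild("build", ui, bundle);                          // hit: same inputs
cachedBuild("build", { ...ui, "Card.tsx": "x" }, bundle);  // miss: inputs changed

console.log(buildsExecuted); // 2
```

Remote caching, as described for Bob and Alice in the video, is the same lookup with the map replaced by shared storage, so a key computed on one machine can be found from another.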
Info
Channel: Fireship
Views: 672,697
Keywords: webdev, app development, lesson, tutorial
Id: 9iU_IE6vnJ8
Length: 9min 7sec (547 seconds)
Published: Mon Dec 13 2021