Linux and The Tragedy of the Commons

Captions
Hello there, my name is Gary Sims and this is Gary Explains. Now, one of the researchers at Google has posted a blog saying that maybe the Linux kernel is understaffed to the tune of 100 engineers. How can that be true when there are thousands of contributors, millions of lines of code, and so many patches being posted? Is Linux really understaffed? Well, if you want to find out more, please let me explain.

Okay, so we all recognize that Linux is a phenomenal success. It is the biggest software collaboration project in the whole of computing history. It has produced the Linux kernel, and of course around that come all the other things that we use, from desktop stuff to tools, utilities, system monitoring, web servers, email servers, the whole lot, all built around the Linux kernel. And it's used in servers across the world, from mainframes all the way down to something small like the Raspberry Pi.

But of course, because it is so large and so successful, the responsibilities, the expectations, what people want from it, grow in accordance with how popular it is. That's true of every project. If nobody was interested in Linux, nobody would care if it was understaffed in any way, and they wouldn't care if the software development processes weren't working as they should be. But because it is so important, because it is so popular, these things need to be looked at.

Now, the security researcher uses the analogy of a car. If you have a car that takes you from A to B, and you got there in comfort and didn't have any problems, you say, "Well, that was great, that was easy." Linux is the same. You're running it on your Raspberry Pi, it's running on your web server, it's running on a big mainframe, and when it works: well, thank you, Linux, that's great. But what happens when things go wrong? What happens when it breaks down? In particular, when there are bugs in the Linux kernel, or bugs in the tools that we use around the Linux kernel, these can of course lead to security problems. How are they addressed, and how are the fixes pushed out to the users of Linux?

Now, some people have a misconception about Linux. They think it's perfect, there's nothing wrong with it, and any development that's done is just for bringing in new features and making things even greater. But here's a shocking number for you: the stable Linux kernel gets 100 bug fixes per week. 100 per week, so that's 400 to 450 bug fixes per month, some of which can have security implications. And often the security implications aren't known at the time the bug is fixed. Only later on, when further analysis is done of, let's say, one particularly nasty bug, do they say, "Oh, good job that was not exploited, because that could also have led to security problems." So bugs are fixed because they're bugs, not necessarily because they are security problems.

But think about it this way. You are now a vendor of a Linux-based technology. Android, for example, uses the Linux kernel. You've got all these web services, you've got Amazon and Google, and they're all running all of their services based on Linux. Now, at some point there's a bug found; 450 a month, in fact. How do you roll those into the new version of Android? How do you roll those into the latest version that's keeping the banks up and running, or keeping the websites up and running? How do you do that?

Now, most Linux servers are not running the latest and greatest version of the Linux kernel. There are different types of Linux kernel. There's the latest and greatest, under development. There are what are called stable kernels, which get some point releases while the next kernel is being updated. And then there are long-term support kernels. All of these kernels need to have the same bug fix applied to each one of them. So, for example, Android uses Linux. Now, Android 12 is using Linux 4.19 at its best; it's actually able to use older kernels as well. Now, 4.19 came out when? 2021? No. 2020? No. 2019? No. 2018, yes. So even the latest and greatest version of Android is using a Linux kernel that can be several years old.

When they find bugs in new versions of the Linux kernel, they go back and ask, "Does it exist in the other ones?" And then the fix has to be backported. So what can happen is that each vendor is looking at the current bugs that are being fixed and says, "We have to fix that bug that was in the latest kernel. We found out it's also in this older kernel, and that's what's running on our production services. It's vital that we keep that secure and safe." So they have to backport those fixes. There are internal engineering hours used to do that backporting, and then that effort is duplicated across multiple different vendors under a whole bunch of different circumstances. The same exercise is repeated time and time again, independently, across the world in different places.

And this is where we get the idea of the tragedy of the commons. Now, "the commons" in the UK and Ireland is the term used for common grazing ground. If you've ever heard of the Wombles of Wimbledon, it's the Wombles of Wimbledon Common, and Wimbledon Common, which is actually near to where I grew up, is historically a common grazing ground. What it basically means is public land that anybody can go on; it's not owned by any particular private person. Now, of course, when you have grazing rights on common ground, if there are no rules, if there's no kind of structure from society, from the local village or whoever is using that land and implementing those grazing rights, then you can get overgrazing and the resource gets depleted, because everybody's looking out for their own interest. "I want to put extra cattle on, I'm going to do it." "But I want to put extra cattle on, I'm going to do it."

You get the same thing with an open resource. Linux is an open resource, free to use by anybody. Then you get people who say, "Well, for my interest, I need this bug fixed in the production version that's running on this huge set of web servers our whole business is based on. I want it fixed now." "Oh, but so do we." And so you get these people duplicating all of their efforts, which ultimately detracts from the kernel itself, because those are hours being wasted doing the same thing that could be used on improving other areas of the kernel.

So some vendors might say, "We can't fix 450 bugs." Now, you and I know that whenever Windows or the Mac says you've got a new update and you need to reboot, we kind of go, "Oh no, I need to reboot. I've got all these tabs open in my web browser, I've got all these programs, I've got to save all the artwork that I'm designing, all this code that I'm writing, this document that I'm doing. I've got to shut it all down and reboot." We, just as consumers, find these reboots for applied fixes a pain. Now imagine you are running thousands of servers for some big website, and they say every week another 100 fixes need to be applied. You need to keep rebooting, so you'll be rebooting all those servers every single week. It takes a lot of time, you've got to install it, and there's a whole bunch of management issues about getting that running.

So of course a lot of vendors just say, "Well, we're not going to fix 450 bugs a month. We're going to look at that 450, decide which are the two or three most important ones, and fix those, because those are the critical ones." Of course, the problem is, how do you detect those? So again you get a duplication of effort, because they're all trying to look at the bugs and ask, "Does this have an impact on the kernel that we're running in our production version?" And when it comes to security, there are the CVE numbers, the Common Vulnerabilities and Exposures, and that does give a guide about how severe a particular bug is. But it is itself an imperfect system. There have been cases of bugs that were fixed and were never even assigned a CVE number. Only years later someone said, "Actually, that should have been assigned a CVE number, because that's really nasty. If they had actually exploited that, we'd have been in real trouble." It never even got a CVE number. And sometimes CVE numbers are assigned, but only after several months. So it's not a perfect system. If you want to be ultra secure, you have to fix all 450 bugs, which in itself, again, is not practical.

So the Google researchers reckon there is a lack of up to a hundred engineers who should be working on the Linux kernel and on the surrounding tools, including the compiler. Those 100 engineers should be working to make sure that the kernel remains stable and has as few bugs as possible. And actually, at the time of writing this program, there are hundreds and hundreds of open bugs in the Linux kernel that are documented and logged ("this is a problem", "this doesn't work") and that are not fixed. Hundreds and hundreds of them. They need to be fixed, because each one is of course a potential security issue, but each one is also a potential issue for lack of stability, for data loss, and for all kinds of other problems that could be occurring.

Some people seem to think that the Linux kernel is just magic: it magically just works, magically these things happen. In fact, I had a really bizarre conversation with someone in the comments when I was talking about CBL-Mariner, the Microsoft Linux. They were saying that if Microsoft switched to using Linux for the Windows code, they would just need one engineer who would compile the kernel every so often, and that would be his only job. And I was like, "You clearly don't understand the magnitude of the work that goes into getting the Linux kernel working, and the magnitude of backporting and monitoring and tracking and releasing." Any software engineer would tell you; just ask them about the philosophy of release management and change management and you'll have a really long conversation. You may find it boring, but the point is this: it's not just the case that you need a guy to magically compile the kernel and, hey, I've got a Linux kernel, that's great. It's way, way more complicated than that.

So this is a clarion call from Google, who itself does have internal engineers that just work on Google's internal systems, but of course it is also funding external engineers who just work on upstream stuff. Now, Google has been upstreaming its bug fixes on Android and Chrome OS, and so it's saying that everybody else should be upstreaming their work rather than just working on the downstream stuff. Whether that will happen or not, we don't know. But this is a common resource, and it needs to be managed in terms of security and bug fixes. Not new features, not where it's going: in terms of maintenance, it needs to be managed so that every device, from a smartphone or a Raspberry Pi right up to a big mainframe or a supercomputer, can benefit from all of those bug fixes.

Okay, that's it. My name is Gary Sims, this is Gary Explains. I really hope you enjoyed this video. If you did, please do give it a thumbs up. I hope you're following me on Twitter at @garyexplains. I also have a newsletter: go over to garyexplains.com and type in your email address. No spam, just the newsletter. Okay, that's it, I'll see you in the next one.
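The arithmetic behind the video's argument can be roughed out in a few lines of Python. This is only an illustrative sketch: the 100 fixes per week comes from the video, while the hours per backport, the number of vendors, and the helper names are hypothetical values chosen purely for illustration.

```python
# Figure quoted in the video: about 100 stable-kernel bug fixes per week.
FIXES_PER_WEEK = 100

# Hypothetical assumptions for illustration only (not from the video).
WEEKS_PER_MONTH = 52 / 12      # average number of weeks in a month
HOURS_PER_BACKPORT = 2.0       # assumed engineering hours to backport one fix
VENDORS = 10                   # assumed number of vendors backporting independently


def fixes_per_month(fixes_per_week, weeks_per_month=WEEKS_PER_MONTH):
    """Convert a weekly fix rate into an average monthly fix rate."""
    return fixes_per_week * weeks_per_month


def duplicated_hours(fixes, hours_per_fix, vendors):
    """Hours spent beyond a single shared backport when every vendor
    repeats the same backporting work independently."""
    return fixes * hours_per_fix * (vendors - 1)


if __name__ == "__main__":
    monthly = fixes_per_month(FIXES_PER_WEEK)
    wasted = duplicated_hours(monthly, HOURS_PER_BACKPORT, VENDORS)
    print(f"Fixes per month: {monthly:.0f}")
    print(f"Duplicated engineering hours per month: {wasted:.0f}")
```

With these inputs the monthly figure falls inside the 400 to 450 range quoted in the video. The duplicated-hours number depends entirely on the assumed inputs, so it shows only how quickly repeated backporting adds up, not a real measurement.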
Info
Channel: Gary Explains
Views: 50,041
Keywords: Gary Explains, Tech, Explanation, Tutorial, Linux, Linux kernel, Google, The Tragedy of the Commons, Linux Security, Linux bugs, bugs, security bugs, linux vulnerabilities, security vulnerabilities, linux kernel, CVE, security, security exploits
Id: BhTQyeEdnzs
Length: 11min 14sec (674 seconds)
Published: Thu Aug 12 2021