Kevlin Henney - Software is details (Keynote)

Here you are looking at the roof of part of Charles de Gaulle Airport in Paris, and you can see there's a lot of detail. I think we're used to the idea that in the real world there is a lot of detail in everything, but we still abstract it. People often think of architecture, building architecture, as not the detail level, and we suffer the same problem in software. Sometimes people ignore the details; they regard them as just details. They often say, "Oh, that's just a detail, don't worry about it." What I would like to put forward to you this morning, and make a really strong case for, is that software is details. That is precisely what it is. So we can't dismiss an aspect of software and say "that's just a detail", because we don't yet know which of the details are important, which are the ones that matter. Indeed, this is part of the challenge of software.

My name is Kevlin Henney. This is what I look like in the real world when I'm standing up. You can find me online at a number of different places: Twitter, LinkedIn, via email. If you want to follow up on social media, I'm good with that. I'll try and make some time for questions at the end, so please put your questions in the chat. I may have a look a couple of times at the message stream, but that's asynchronous to what I'm seeing and saying through this system, so that might be a little bit dissonant. We'll see how that one goes.

My background: I work for myself. I run workshops, training and consultancy, and I write, and it touches on all of the levels of detail, if you like: the big-picture stuff, software architecture, the patterns, the design rationale and things like that, all the way through to a couple of books that I have edited. 97 Things Every Programmer Should Know was about a decade ago, and then last year, during lockdown, with Trisha Gee from JetBrains, we
finished 97 Things Every Java Programmer Should Know. There's a whole aspect here of detail and consideration of detail.

What I like is when we try and challenge ourselves and say: what is it that we do? What is it that we build in software? There are lots of metaphors, and people find lots of creative metaphors, and these are very powerful. You choose the right metaphor and it changes the way you look at things, the way that you develop, the way that you build, the way that you code, the way that you envisage architecture. Each one of these different perspectives offers us some kind of insight. They are all, at one level or another, potentially valuable, but sometimes they hold us back. If we start thinking of details as things that don't matter, with everything else being the stuff that matters, so we don't need to worry about them, we are very likely to find that the thing we said doesn't matter has a habit of mattering.

So I'm going to borrow this definition from Meir "Manny" Lehman, from 1980. I love this definition because I think he must have had great fun writing this sentence: "Any program is a model of a model within a theory of a model of an abstraction of some portion of the world or of some universe of discourse." Now, unpack that one slightly, try and memorize it, and translate it as necessary, because the next time you're at a party and somebody asks what it is that you do: "Oh, you know, I work on programs that are models of models within theories of models." That's great. It might not get you a lot of friends, but it'll certainly sound impressive.

There's a point here: what we are doing at each level is describing something, an idea, a concept. One of the key words here is not just model or theory, but abstraction. There is a tendency for us to think of abstractness as being vague, high-level, not committed, hand-wavy. There's a lovely definition from Edsger Dijkstra, who highlights what it is that
abstraction is, what its purpose is. Abstraction is the removal of something. It comes from the Latin abstrahere; it means to draw off, like to draw off water. We talk about abstracting water from the ground. Well, we don't really, not these days, but we use abstract to mean something that has had something removed. Abstract art is art of the world, but with the world removed, if you like. An abstract class is a class, and yet it has had parts removed, effectively. An abstract method is a method without a method. So we use the word abstract, and whenever I ask people to tell me what abstraction is, they say, "Oh, it's where we try and understand the details." No, no, no: you're telling me what good abstraction is. Tell me what abstraction is. It's just removal.

Why do we do it? The purpose of abstraction is not to be vague, which is a danger, because sometimes we find that people are doing this. Whether you're called architect or developer, people end up, without meaning to, being vague: "Oh, the abstract view", and there's no commitment to detail. And that's the point, that's the goal of this. As Dijkstra says, the purpose is not to be vague, but to create a new semantic level in which one can be absolutely precise. The reason we remove things is not just because they're annoying, not just because it's noisy and we can't understand, but to find our way to a point of precision.

So this semantic-level idea is important. Now, semantics is a word that we sometimes throw around. We can find it in computer science, we can find it in philosophy, and we also find it when people have an argument. Whether it's a political debate, a religious debate, a debate of the everyday world or a debate of technology, at some point somebody will say, "Oh, that's just semantics." I'm pretty sure I've said that. People dismiss something; they say it's just semantics. It's like, well, what does
the word semantics mean? Sometimes we hide, particularly in English. English has many layers of words, and semantics is a classic word, from the classical world. Let's make it more conventional English: "it's just meaning". Well, hang on, what else are we arguing about? It's just meaning: that's the whole point. There is nothing but this. And we have the same dismissive attitude: "it's just details".

The importance of this is highlighted by Alfred North Whitehead, whom you may not have heard of. He was a British mathematician and logician in the early part of the 20th century. With Bertrand Russell he wrote Principia Mathematica, which was published in, was it 1910, 1911, the 1910s, the early part of the century, where basically they tried to demonstrate and prove that mathematics was complete and consistent. It turns out that's not actually true: as Kurt Gödel demonstrated in 1931, you can't have a system that has these two properties of completeness and consistency. As Whitehead highlighted, inadvertently, here: "We think in generalities, but we live in detail." The detail is where we find the problems. The detail is where we find the challenges. The detail is where we find, funnily enough, a lot of the meaning.

So there's this idea of the big picture versus the small picture. In his book Think Like an Artist, Will Gompertz has this lovely way of describing the mindset of an artist, but when I read this I thought: wait a minute, this is software development. We're talking about a very specific mindset that is crucial when it comes to the act of creating. It is an attitude that can be encapsulated in a simple but demanding way (it's simple to describe, but that doesn't mean it's simple to do): always think both big picture and fine detail.

If you think about the way that you work with code, versus the way that people try and describe it when they talk
about process and methodology, the way that you most effectively work with code is: zoom back, let's understand the bigger context; then zoom in, focus on the detail; and then zoom out. That cycle of zooming in and out can happen over minutes. Studies have been done that demonstrate that when people are doing such technical problem-solving tasks, they cycle up and down through the different levels over a matter of minutes.

What is interesting is that when people describe things methodologically, they say: you have a workshop and you focus on the high-level things, you have a meeting and you focus on the high-level things, and then you do this, and then you do this. It's like an airplane landing: you start at the big picture and then slowly you descend. It's a very elegant picture. It's unfortunately not at all how humans work. And they talk about time: "Oh, this happens over a sprint, or over a day." No, these things happen over minutes. It's not supposed to be just a descent, it's supposed to be a cycle, and the cycle is measured in minutes, not months. This is really important.

Sometimes people get stuck at the high level. Those in software development often accuse those in management of being stuck at this level. But then there's the other point, where people get accused of being stuck in the detail: software developers are often accused of not pulling up and cycling through.

We see this in other disciplines. I mentioned architecture before, and I found it interesting, because when we look at other disciplines we should learn from them, sometimes directly, sometimes indirectly. It turns out that these other disciplines are populated by human beings, and they suffer exactly the same issues. So on this particular social media feed, Leewardists, which is for architects, building architects, they talk about this idea, and sometimes they try to
highlight the difference between the idealized view of architecture and what actually happens. When a lot of software people criticize the building metaphor, architecture, they criticize a stereotype of architecture, not what architects actually do. And here we see the very same thing: architecture, oh, this is beautiful, pristine, this kind of big-picture image and all the rest of it. The problem is that life is what architecture is for. Architecture accommodates life; it's all about the details. It turns out that sometimes our huge, grand gestures in building architecture are as problematic as our huge, grand gestures in software architecture. People put together an architecture based on PowerPoint; they think of architecture as the big blocks in the system. I want to demonstrate that this is absolutely not a helpful view.

So I want to imagine going to Paris. Actually, I would like to imagine going to Paris: my travel has been somewhat restricted the last couple of years, as I'm sure yours has. Paris has a very characteristic blend of architecture, but what I want to do first of all is consider a definition of architecture. Grady Booch observes that architecture represents the significant design decisions that shape a system. In other words, all architecture is design, but not all design is necessarily architecture. Architecture is the significant design decisions.

But what do we mean by significant? Is it size? Well, kind of, but it's not size in the way you think about it, because it's not size like in buildings. Buildings are built with respect to the human scale, so we measure them against ourselves; the largeness is to do with us. But how does that work with software? There is the idea of "can we fit it in our head?", do we have the cognitive bandwidth to understand something, but that's not quite as strong and clear as a physical metaphor. But there is a way of looking at it: significant
is measured by cost of change. How hard is it to change something? If you have to push against it really hard, then clearly it's big. So the bigness is not to do with how much space it occupies on a PowerPoint diagram; the bigness is to do with how difficult it is to change. Often we don't know that in advance, and we must discover it by looking backwards. What that highlights is that sometimes the things that are significant to change are not the things that initially appeared big. They are the things that are networked everywhere, the things that are built on our assumptions. They are in fact the details that are ever-present, not the big-picture view.

So let's go for a big-picture view. Let's look at Paris. Here's the Google Maps view of Paris, and straight away you can see the River Seine, you can see all the key roads, you can see some of the most significant tourist sites, you can see the green spaces in the city. This is magnificent. I can also see the routes by which I can potentially drive out of the city or into the city. What I can't see, however, is what side of the road people drive on. Now of course, as a point of general knowledge, France drives on the right, like most of the world, but that's not visible on the map. That's an assumption here.

OK, let's zoom in. Now we see more details: more sites are highlighted, there are more roads. In fact, we now have enough detail here for a walking map. But look again: you still cannot tell on what side of the road people drive. Only at the next level of detail in do we get indications: there are arrows on the road. At this point we start discovering that, actually, they don't necessarily drive on a particular side of the road. Most of these are one-way roads. That is one of the big challenges of Paris: particularly in the center, a lot of
the roads are one way. So our assumption has been invalidated. What we see is a general sense of flow. There is one indication, right up here, that driving on the right is the default, but we have so many exceptions to the default. So the point is that only at this level do I understand how I can drive around Paris, and yet the big picture did not show me that.

This is an interesting one, because these assumptions are everywhere. But even if we are dealing with a city where everything follows the default, what if we change that? Now, that's not an experiment anybody's likely to run again very soon. The last European country to change the side of the road that it was driving on was Sweden, in 1967, and that required a huge national effort. It's probably something nobody could do now; back then there was less dependency on cars.

What we're saying is that flow is the one thing that people don't represent on the big diagrams, and yet it's so important. This is an important characteristic in software architecture as well as building architecture or urban architecture. Flow is really important, and people have a huge blind spot about this. I've got this list that goes off the top of the screen and off the bottom of the screen: what is the definition of control flow in your system? Now, a lot of people say, "Oh, that's just a detail." I've met people who say, "Oh, multi-threading, that's not architectural, that's just a detail." Yeah, good luck with that. It turns out it's a huge assumption, because these are all different ways of expressing the flow of control and the flow of time in your code. When you change from synchronous to asynchronous, when you change from single-threaded to multi-threaded, when you introduce coroutines in a non-coroutine environment, if you take a code base that is in a language that does not depend upon exceptions and you try and translate that into something that does
use exceptions, or vice versa, you will find there are significant changes, that your work is defined by these changes. And not details? Of course they're not details. You're defining the physics of your application. You're defining how time flows. That is not a detail that can be ignored. It's a detail that is everywhere, and if you try and change it, you will discover it's a significant change. But you never see it on a big diagram, so people have this blind spot. They relegate things: "Oh, control flow, that's like ifs and elses, that's a small detail." Yeah, kind of, except when it isn't.

We also have other things. We think, "Oh, naming is not important." Or rather, naming is important: we argue about naming a great deal, but many people still think of it as a detail. So here's Cloneable. Let's have a look at Cloneable. It's an interface in the JDK, part of the core language, been there forever, since before Java 1.0. The idea even jumped across to .NET. However, one of the most interesting things is the spelling. I know English spelling is an absolute nightmare. Let's be polite: it's very historical. Sometimes people say that English doesn't have a logical system of spelling. That's actually not correct. English has about 10 or 11 logical systems of spelling, all at the same time. That's the problem: it's not that it doesn't have one, it has more than one. And in as much as there is a correct way to spell clonable, it's like this. In other words, it doesn't have that first e. If I recall correctly, the release notes for Java 0.9 said that they ought to change the spelling before too many people depended on it, but that never happened, and now people in software think that clonable is spelt with that extra e. In fact, it's got to the point that if you go to Wiktionary and look up the definition of clonable, there it is: that's the
appropriate spelling, the accepted spelling, and then under alternative forms, "especially in computing contexts", there is cloneable, with that extra e. You might think, "Oh yeah, but Kevlin, that's just a detail." Yeah, sure. Tell me how much of a detail it is: tell me how much it would cost Oracle and Microsoft to change the spelling of that. At that point you suddenly appreciate: oh, OK, that's quite big.

Is naming architectural? The answer to that question is: it depends. We only understand it by the dependencies, the inbound dependencies on it. If there are lots of dependencies on a thing, then it is harder to change, and the further away those dependencies are from your control, the harder it is to change. So your goal in architecture is to enclose the things that you want to be able to change, and create boundaries, so that the things at the boundary are hard to change and the things inside the boundary are easy to change. In other words, make your code easy to refactor. Otherwise you will find that small details become horrifically hard to change.

There's an oft-quoted principle, the open-closed principle. It's based on a misunderstanding of what the real open-closed principle is, which I won't go into here. The misunderstanding unfortunately has a lot of people believing you should apply the open-closed principle as a general principle in your code, that everything should be open to extension and closed to modification. Absolutely not. A good architecture is one that avoids the open-closed principle as much as possible. We want our code to be as open to modification as possible: maximal modification. This is the modern architecture. We believe in refactoring; we believe that we're not perfect, so we need to refactor. What we need to do is arrange our architecture so that it allows that. So OCP is a negative recommendation: do not follow it. Create an architecture in
which you are not following it, or don't have to follow it everywhere. That's a really important distinction. Otherwise you will end up being trapped by exactly this kind of spelling mistake. The HTTP referer, before anybody mentions it, is another good example of a spelling mistake that has made it everywhere and is almost impossible to change now.

We've talked about control flow, we've talked about various things, and sometimes people say, "Well, we just need to prove our programs are correct." OK, fine. Let's go back to 1983. Sometimes people say we need to put more effort into making things right. Yeah, sure, but the whole of human history is that we've been here before. It's not that we can't do it; it's just that we can't do it perfectly. And there are some really interesting examples.

So here is a piece which we're going to discover is accidentally ironic, by Jon Bentley. Jon Bentley, algorithms guy, wrote this series of columns, Programming Pearls, which were gathered into books, Programming Pearls and More Programming Pearls. They were very influential, really good books, and certainly influenced me in my early days. These books are from the 1980s and 1990s. There's a lot of wisdom here, but there's also a lot of assumption. Here he demonstrates how to write a correct binary search. Binary search is one of those things that's incredibly easy to describe, but people often end up with off-by-ones and things like that. So he comes up with an algorithm: he describes it here in pseudocode, this is the simple way of doing it, and then he expands it with all of the invariants, the assertions that must be true at various points. And he says that one of the major benefits of program verification (so he's talking about trying to prove the correctness of your programs) is that it gives programmers a language in which they
can express this understanding. These techniques are only a small part of writing correct programs; keeping the code simple is usually the key to correctness. So that's a good starting point. But what is interesting is that he observes: "On the other hand, several professional programmers familiar with these techniques have related to me an experience that is too common in my own programming: when they construct a program, the hard parts work the first time, while the bugs are in the easy parts."

This is interesting, because we find this in the way that people prioritize their own testing work: "Oh, we'll only test the hard parts." Obviously, if you've got a finite amount of time and a semi-infinite amount of code that you're trying to test, say a legacy code base, then yeah, you have to prioritize, and looking at the complex bits is a really good starting point. But don't fool yourself into thinking that that's enough. This is, by the way, why you should be writing tests alongside your code, however you choose to do it, whether you use TDD or ITL or any other technique. When we talk about coverage statistics, statement coverage, people often aspire to less than 100 percent. I think that this shows a complete lack of ambition on our part. Why would I want less than 100? "How could I have achieved less than 100?" should be the way I look at this. Statement coverage is one of the easiest things to get, so why would I ever settle for less? Of course we have an issue with our legacy code, but what about our new code?

Now, there's also this blind spot that we have: "When they came to a hard part, they hunkered down and successfully used powerful formal techniques." I'm not going to say that was actually true, in the 1980s either. It wasn't true in the 1980s; it's never been true. But people who have used these techniques will normally try and prove the
hard part, just as people prioritizing testing, particularly when they are thinking about new code, or perhaps haven't really understood the value of more continuous testing, or when they're testing legacy code, will go and look for the hard parts. Good starting point. However, it's in the easy parts, where they return to their old ways of programming, that they end up with the old results: mistakes.

Let's look at how this plays out. Here is binary search. This is the binary search implemented exactly according to Bentley's algorithm, and it was implemented by a student of Jon Bentley, Joshua Bloch, who was responsible for the Java collections. This is the binary search that was found in the Java collections. It was introduced in Java 1.2, and in 2005 there was a bit of a discovery. It turns out that this works fine until you hit very large arrays, at which point a bug is revealed. When you're looking for the midpoint between two integers, if you have a mathematics background or you're thinking formally, you probably add the two together and divide by two. It's the mean of the two positions, and the simplest way to calculate the mean is to take the sum and then divide by n, which is exactly what you're seeing on the screen. You will also notice that on screen it's red. Red means it's wrong. This fails for very large arrays: if you have a large array where low plus high takes you over 2 billion, then this wraps around to be very negative, and it starts failing. What you actually need to do is find the difference between them, find the midpoint of that value, and then add it to the low. That avoids wraparound.

That bug was present for decades. It was discovered as a bug in 2006, so it had been in that published algorithm for nearly a quarter of a century. And if we go back and look at it, here's the assumption: the idea that we're
dealing with integers. But remember, when you use a computer, except in a few languages, you're not dealing with integers; you are dealing with a subset of the integer domain. The assumption here is that (L + U) div 2 is a valid mathematical statement, and in certain cases it's actually not.

So what we're revealing is an assumption. Assumptions are kind of the barefoot-trodden Lego bricks in the dark of knowledge. Assumptions are a curious piece of knowledge: you don't know they're there until you step on them and you go, "Oh, that was an assumption." Even if I tell you that you have assumptions, you won't be able to find them until you stumble into them. You might say, "I have assumptions, but I don't know what they are." You don't know what they are until they are contradicted. You only understand them in the contradiction. Your colleague tells you something and you go, "Oh, I had assumed that..." And we see here exactly that.

We have assumptions about other aspects of our code. Let's go back to 2016. There's a lot of really bad stuff in 2016.
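
As an aside on the binary search midpoint discussed a little earlier, the wraparound can be reproduced in a few lines. This is only a minimal sketch, assuming Java's 32-bit `int`; the class and method names (`MidpointOverflow`, `naiveMid`, `safeMid`) are hypothetical and are not taken from the talk, the slides, or the JDK source.

```java
// Illustrative sketch only: hypothetical names, not the JDK's actual code.
class MidpointOverflow {

    // The midpoint as the mathematics suggests: the mean of the two positions.
    // With 32-bit ints, low + high can exceed Integer.MAX_VALUE (2,147,483,647)
    // and wrap around to a negative number before the division happens.
    static int naiveMid(int low, int high) {
        return (low + high) / 2;
    }

    // Take the difference first, halve it, then add it back to low.
    // Assumes 0 <= low <= high (as with array indices), so high - low cannot overflow.
    static int safeMid(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;
        int high = 2_000_000_000;
        System.out.println("naive: " + naiveMid(low, high)); // prints naive: -397483648
        System.out.println("safe:  " + safeMid(low, high));  // prints safe:  1750000000
    }
}
```

For small indices the two methods agree, which is why the assumption can sit unnoticed for so long; they only diverge once `low + high` no longer fits in an `int`.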
Another one: the left-pad incident with JavaScript. There are now more people programming JavaScript than were programming JavaScript at the time, in 2016, so many people are not familiar with this incident. Basically, people were depending on a little piece of code that just did padding. There's a kind of culture in some programming language communities, and I'm not going to say it's the worst in JavaScript, but I'm going to say it's the worst in JavaScript. In JavaScript people just say, "Is it on npm? Let's pull it in." People are doing anything but writing their own code. There's a balance where we have to think: reuse or write? Do I control, do I understand, my dependencies? Because sometimes people are taking on needless dependencies. They're depending on a thing just because it's there: somebody's written it, and I'll trust it. I would suggest not trusting code, especially for small things. When you are dealing with a large framework with an active community supporting it, that is of architectural significance; that makes perfect sense. But bringing in small bits and pieces, adding a dependency where you don't need it, gives you more to control, and it also decreases the knowledge you have of your own code base. It increases the assumption set.

What you want is to understand your architecture. Most people don't understand their architecture. They say they do, but they understand the bits they wrote. The problem is that the bits they wrote are at the tip of the tip of the iceberg of everything they depend on, and if that doesn't all work, you're in for some surprises. So a lot of people depended on this function, or they depended on code that depended upon it, so these people didn't even know that they were depending on it. And here is a function that pads a
string. Now, using a proper language like Python, padding a string is something you can do really easily, but the web has chosen JavaScript, so this is where we are. There is now a pad-left function in JavaScript as a response to this.

Now, somebody's written this function, it's available, it's open source, and we should always thank people for their generosity. But at the same time, just because they're generous doesn't mean that it's good. This is not great code. It's very non-idiomatic from a JavaScript point of view. I feel very uncomfortable looking at it. I'm not a JavaScript programmer, I'm a JavaScript tourist, but I look at this and it really needs some tightening up on the basics. This is somebody being enthusiastic to write stuff. But they came into conflict over a licensing difference, and they pulled this from the npm repository, and this had a domino effect, causing lots of websites to fail, websites that didn't even know they depended on this detail of filling a string.

They actually put up another version. This one, from the point of view of understanding what an appropriate programming practice is, is even more worrying when I read the code, because this person clearly doesn't understand JavaScript at all, and has put a lot of comments in, which really make it unreadable. Now, as I say, I'm not a JavaScript programmer, but I had a go at doing this, and it turns out you can do it in just a few lines. But I also did something radical: I wrote tests for it. It turns out that details matter. Write some tests for it; document your assumptions, even if you don't know the area. That's how I approached this. I thought, I'll have a go at this. And this was normally where I'd leave the story. I'd say: yeah, I wrote some code, it was shorter, I wrote some tests, this
is what the tests look like when they pass. Look, it's all green. My tests are very simple; they were just simple smoke tests. Let's document my assumptions, or my beliefs, about what the code should do: give them nice names and an expression that should work, in a simple testing framework. When you run this testing framework, it all passes. It passes with my code just fine. And then I ran it against the other two versions of left-pad. Guess what? They're not properly tested. They fail in a number of very well-defined cases that apparently people either worked around or hadn't noticed.

This is the interesting thing about this detail, when you take on these assumptions. We were informed of this 50 years ago. There's a 1970 or 1971 paper, '71, by David Parnas: "The connections between modules are the assumptions which the modules make about each other." These are the details. It's not just the types: we assume how things work, we assume their control flow, we assume their stability, their correctness.

So we need to better understand our mental map, our world of code as it were, when we approach a system. We might think: here's the code I'm focusing on, here's the piece of work that I am doing, this is the subsystem or the package I'm working on. Then here's the other stuff that my colleagues or I have written in the past, and this is ultimately under our control; this is our code. And we depend on other code that is outside our system, other infrastructure code that we don't own. We think of it like this, but we think of it as less important: it's in the distance, so it's smaller, it's not as important. And then that depends on other code, which we really don't understand at all, and we let our package management tools and dependency managers look after it for us, because it's not important, it's just a detail. And these then also depend on other things, like the platform, the assumptions of
the platform and this also depends on other aspects of the market and the presence and governance so if you ever wonder where do the governance or the regulatory requirements for my domain come from because sometimes people say oh they're the customer requirements they're not customer requirements they are domain requirements your customer doesn't have to ask you for governance or they shouldn't have to ask you for governance because that is the domain that you're working in the idea that all of your requirements come from your customers is a deeply flawed assumption it's not that it's entirely false it's just not entirely true when you are working in a particular domain that domain brings with it a set of requirements of how things work when you are working on a particular platform that platform brings with it assumptions and requirements and sometimes we think oh the language people often say oh the language is not important yeah okay how easy would it be for you to change the language of the system in many cases we'll find that it's so intertwined there are so many assumptions about how the system is built that actually it's hugely architectural but to us it's invisible because it's everywhere all of the time so we don't see it we only think of the blocks and as we see with the left pad incident you know um and adrian colyer observes this he says it's often not the direct dependencies of your project that bite you but the dependencies of your dependencies all the way down to transitive closure to be fully aware of your architecture means understanding all of that and i remember you know i've mentioned this a few times and i've had a couple of what do you mean all of them and they're looking at that like that's a lot of work and they thought i'm doing reuse to save work yeah you are but you shouldn't save more than you can save in other words there's a minimum level that you should have confidence in and as you go through 
you discover things and you really want to create an approach where you discover things earlier rather than later and not in production okay this is why reviews and testing all of these things are really important and getting an early release even if it's imperfect or early development even if it's imperfect is important because you are more likely to stumble on the lego bricks than later in production so remaining in 2016 let's talk about the european space agency's exomars mission the european space agency had already tried once before to land a probe on mars this year we've had two mars landings nasa landed perseverance and china managed on its first attempt not just to get to the red planet they actually managed to land a probe massive achievement that because mars is a planet that has a reputation for eating probes however things did not go very well for the lander part of the exomars mission which is called schiaparelli um there was a point here where schiaparelli was entering mars atmosphere and the inertial measurement unit went about its business of calculating the lander's rotation rate so the lander on entering the atmosphere it doesn't just enter it it rotates for stability okay and you can imagine that entering an atmosphere is a violent and very energetic process um and so it's basically trying to calculate the kind of the wobble okay because it's going to wobble slightly as it enters it's a very turbulent stage and then there's a bit of jargon here the imu calculated a saturation maximum period that persisted for one second longer than what would normally be expected at this stage so something didn't go quite right and this calculation for whatever reason was wrong and it ended up leading to a calculation that calculated a negative altitude in other words that the probe was below ground now one of the great things is that esa publishes all its reports but it also uses that wonderful 
euphemistic language of space travel anomaly so yeah feel free to use this phrase in your own work you know anomaly it's like a catastrophe an anomaly oh it's just slightly off you know it's a little bit strange um because of the error in the estimated attitude that occurred at parachute inflation turns out that the issue remember i said the wobble let's simplify it the wobble basically oh half a degree this way half a degree that yeah you're wobbling you're wobbling half a degree this way then half a degree that way means zero except that it was accumulating half a degree this way half a degree half a degree this way one degree half a degree this way one and a half it accumulated where it should have um had an appropriate sign uh checked so therefore an erroneous off vertical angle with respect to the vertical what's the angle of the probe deduced a negative altitude and then this lovely little just reminder the cosine of angles greater than 90 degrees are negative it's just like this little piece of school trigonometry just thrown in there why is this important let's understand cosine of off vertical there is an assumption the assumption is that this is always greater than or equal to zero so when it goes into the next equation next piece of calculation we know that everything's right but this is an assumption it was never stated as this but now with hindsight we can see the assumption is there but actually this isn't the assumption the assumption is actually that the off vertical will always be less than 90 degrees therefore okay and given that everything only works in radians let's switch to radians but we can phrase this as a precondition the precondition for this working is that we have an off vertical less than or equal to half pi and we can actually phrase this in a structure that is more familiar from a point of view of formal methods and formal proofs uh an idea that precondition 
operation post condition is described originally by robert floyd and then formalized by tony hoare in 1969 there's a simple idea here the precondition then the operation and the post condition what must be true afterwards these days we're more likely to say given when then we've been reinventing this kind of rule of three for the last um half century given that the off vertical is less than 90 degrees when we take the cosine then that result is greater than or equal to zero and if that's always true then it's all good otherwise we're going to be moving to surprises but another way of looking at it is what if i rewrite it like this what if instead of describing an assumption i'd create a branch and we see that we don't know what it's supposed to do and that in the question mark is exactly what happened calculate the altitude using that and we just did that this fateful miscalculation set off a cascade of despair which might be one of my favorite phrases um in uh technical english um so well done the folks at gizmodo for reporting like this and feel free again for you to use that phrase when something goes wrong um you know your company suffers an outage a cascade of despair so a couple of weeks back facebook suffered a cascade of despair when it kind of got its border gateway protocol a little bit messed up so basically the probe thought it was below ground so it said oh my goodness i'm fully dressed i've got all of the stuff for entry still on i've got a parachute i've got a back shell so get rid of these fire the thrusters and we're ready and you know put out the landing pads and unfortunately it happened uh nearly four kilometers above ground and yeah the rest is just gravity uh and actually the orbiter could identify the site where the crash was so there's a point here i mentioned testing a few times um simple testing can prevent most critical failures which is one of the most obvious helpful titles i've ever seen in 
the paper um so this is 2014 and these guys observed across a number of distributed data intensive systems predominantly written in java for distributed system purposes and they look through basically all of the changes in the bugs their observation almost all catastrophic failures are the result of incorrect handling of non-fatal errors okay in other words something goes wrong and we haven't handled that how we recover from it correctly they also discovered that a majority of the production failures could be reproduced by a unit test and that doesn't mean we can always catch them in advance but it does mean that retrospectively we are able to do that and so that reinforces an observation that neal ford made in 97 things every programmer should know testing is the engineering rigor of software development this is one of the places where we stumble across the details and likewise i said code reviews by the way i need to define there's a lot of people who are watching this thinking oh code reviews oh you mean like you know on pull requests and stuff like that asynchronous code no i mean proper code reviews asynchronous code reviews are not proper code reviews just somebody types that looks good to me done no i mean proper socializing the code sitting down with somebody you know remotely but sitting down with one or more other people and going through talking through the code i don't mean this kind of like async approach that's become popular and a lot of people think that's what code reviewing is about no it's not that's kind of like a backup if you can't do anything else increase the static analysis increase the test and increase the sociability the communication about the code if you're not talking about the code and somebody's just looking through that's not useful you need to have a conversation conversation is how we communicate that's where the assumptions become revealed you need to spend time rather than avoid time do it but also when 
we come to details there are so many things in our blind spot so let's go back to 2017 um but sticking with space travel um so this is a launch from vostochny which is way way way over in eastern russia um soyuz fails to deliver 19 satellites from vostochny um so yeah what happened here um i think there was one main satellite and 18 secondary satellites that were being launched um this mission did not work out um why did it not work out well all the hardware was fine um and it turns out that the three stages the first stage was correct second stage was correct and then it's only when it got to the third stage almost unbelievably the flight control system on the fregat the last part the upper stage did not have the correct settings for the mission originating from the new launch site in vostochny historically many of the launches um for russia have been either in plesetsk which is in uh western russia or baikonur which is in kazakhstan and the rocket was originally going to launch from there and they correctly changed the configuration settings for the first two stages but not the final stage so basically when the final stage came alive it looked at where it was and it went oh my goodness i'm off course like basically about 12 time zones off course because it's been launched from uh eastern russia so it tries to get back to its original calculated course and it burns up all of the fuel doing so and it re-entered the atmosphere around iceland um there was a british airways flight that saw it re-entering the earth's atmosphere but here is this is a configuration error okay again another paper early detection of configuration errors to reduce failure damage our study shows that many of today's mature widely used software systems are subject to lces latent configuration errors in their critically important configurations now why am i highlighting this because a lot of people say well yeah yeah but 
that's not code you know that's just configuration and that is the blind spot and i've said this a number of times every now and then i get somebody go oh no but code is not configuration oh you're so sweet that you think that let's walk through this so i walked through this in an article in a blog post about a year ago out of control it's on my medium blog i raise a question what then is configuration configuration is a formal structure for specifying how some aspect of software should run configuration is code if you don't understand that configuration is code go away get yourself a coffee and meditate on this and when you understand that configuration is code go back and look at your system what are you doing to review your configuration what are you doing to test your configuration okay your configuration is written using a formal notation it may not be turing complete but it's still code it doesn't matter if it's binary it doesn't matter if it's xml or json doesn't matter if it's actually written in a programming language okay the point there is it's still code but we have this blind spot configuration is just a detail it's not important to how a system runs oh yeah yeah i'll ask the russian space agency okay so hopefully what i've impressed upon you is a re-evaluation here and i want to close with an observation from uh robert pirsig who wrote the book zen and the art of motorcycle maintenance i don't know a lot about motorcycles i'm not saying i know a lot about zen but i probably know a lot more about zen than i do about motorcycles um but it's a lovely book it's kind of a classic of its time uh in the 70s i read it in the 80s um first time 90s the second time um and really it's an inquiry into values and quality and things like that and there's this point he's talking about an engine the motorcycle engine has a problem a screw is jammed into it so therefore the engine doesn't work he says normally screws are so cheap and small 
and simple you think of them as unimportant it's just a detail but now as your quality awareness becomes stronger you realize one individual particular screw is neither cheap nor small nor unimportant right now this screw is worth exactly the selling price of the whole motorcycle because the motorcycle is actually valueless until you get the screw out with this reevaluation of the screw comes a willingness to expand your knowledge of it and what i'd like to leave you with is this observation indeed software is details that is what it is that's all it is software is lots of little details all put together in a particular way that must work okay that is what makes software challenging software is not hand waving which is a like hand waving that's not software software is a commitment to detail and it's a commitment that all the details should fit together and provide us with a system of executable meaning thank you very much okay so we have time for questions um let me uh let me just catch up with the chat hi alex uh catch up with the chat see if i can see [Music] we have we had some chat about the details of the code of the platform itself uh so stefan was was there to to to respond to it uh i didn't see uh something about the talk itself um okay right well let's just pick up on where we are in that case um so what do we got uh let me just look at the timings here um so there's that um actually interesting i'm just gonna pick up on something that stefan says earlier in the the feed um yeah yeah wrong line the schedule was scary i'm bad at time zones and thought i missed the start of my okay so that's tom cool's talking there and stefan says stephanie says i hate time zones oh my goodness yes time is i love them um their time zones are like times are the perfect detail i've actually had this conversation with the team where they decided that they were going to try and write their own time zone code because in their heads oh this is just a detail you know yeah we just 
add one hour every now and then and stuff like that and it's just like you have no idea what you don't know you don't know what you don't know this is not just this is a detail of immense magnitude and i demonstrated um you know i'm very interested in uh date and time handling uh and uh probably too interested um so one of the things that i pointed out to them is i asked them just two simple questions i can't remember what those questions were but i do think one of them was to do with leap seconds i can't remember what the other question was and i said i've just asked you two questions you don't know the answer you've got the answers wrong too what makes you think that you are going to be able to write something for a basically a global delivery system uh on which thousands upon thousands of uh direct users and secondary users depend that you're going to get the time zone right i said i've just demonstrated that and say well we budgeted two weeks to write this code i said well just use the stuff that you get with the operating system they were writing low level stuff and it's oh we we don't know how to go and we're sure we can do it right so i said give me give me half a day and i found out how to do it it was very ugly code but nonetheless i said the point here is that this will always be right whereas sometimes you will be wrong this will always be in tune with everybody's understanding of time zones according to the operating system rely on the operating system here but yes um so yes time zones are actually a good example of things that we might consider to be details in this respect and i think down in the chat we have a question about uh well done do you see it okay hang on let me just uh navigate the chat here okay let's move that one down so ah wait a minute so the question is do you predict that software formal verification will be a thing in the future uh like use that skill um i predict that i predict that it won't be in the way that people have 
originally imagined it um so i don't think it will be i think there is an increased use of formal methods in certain domains um there's one called tla plus that is increasingly being used but there is a point here that these rely on the ability to do automatic verification and things like that and these are very powerful kinds of approaches the problem is that um there are a few success stories and what we're seeing is that there are still very few they will continue to be very few because the act of being able to formalize a system and state your assumptions is called programming what you're effectively doing with any formal methods approach is there's something about the value of the formal methods is it offers you a different way to decompose the problem and look at it but it's not necessarily going to give you a benefit in lots of the cases it's not that it's time consuming that's what all these extra tools are for you don't have to do all of this stuff by hand where we are going to see the value of such formal methods approaches not so much the formal methods directly applied but indirectly applied the ability to have tools that do things to our code and say hey did you know here's an assumption and so on so in other words it's not in programmers sitting down doing formal methods it's going to be in the tools that take the results of formal methods and look for certain classes of things and we're already familiar with the idea that deadlock detection is something that happens increasingly in certain classes of system and is automated and so these kinds of things are examples of where formal methods become valuable indirectly rather than as it were directly so any other questions which i'm struggling to navigate the chat here so um just trying to see what else we've got more more questions about the browser everyone is using so details you know yes yeah i can see 
that yeah and hey this is it for the questions part but great talk as always thank you very much and so folks if you want to follow up with anything you know my details are available um feel free to follow up on social media and otherwise with any other observations war stories or anything like that in the meantime i wish you a very good day enjoy the rest of the conference thank you very much
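The given/when/then reading of the Schiaparelli assumption described in the talk (given an off-vertical angle of at most half pi, when we take the cosine, then the result is greater than or equal to zero) can be sketched in Python. This is only an illustration under stated assumptions: the function names and the altitude formula (slant range multiplied by the cosine of the off-vertical angle) are hypothetical and are not taken from the lander's software or from the ESA report.

```python
import math

HALF_PI = math.pi / 2


def vertical_component(off_vertical_rad: float) -> float:
    """Return the cosine of the off-vertical angle, with the assumption made explicit.

    Given: 0 <= off_vertical_rad <= pi/2   (precondition)
    When:  we take the cosine              (operation)
    Then:  the result is >= 0              (postcondition)
    """
    if not 0.0 <= off_vertical_rad <= HALF_PI:
        # This is the branch the talk marks with a question mark: choose an
        # explicit behaviour instead of letting a negative cosine flow onward.
        raise ValueError(f"off-vertical angle out of range: {off_vertical_rad!r} rad")
    result = math.cos(off_vertical_rad)
    assert result >= 0.0  # postcondition check
    return result


def estimated_altitude(slant_range_m: float, off_vertical_rad: float) -> float:
    """Hypothetical formula: project a measured slant range onto the vertical."""
    return slant_range_m * vertical_component(off_vertical_rad)
```

With the precondition enforced, an accumulated angle above 90 degrees is rejected where it enters the calculation, rather than silently producing a negative altitude further downstream. Raising an exception is just one possible choice for that branch; the point of the sketch is that the choice is written down.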
Info
Channel: Devoxx
Views: 1,403
Id: 9hlBEI-3Gro
Length: 55min 5sec (3305 seconds)
Published: Thu Nov 25 2021