"Designing Dope Distributed Systems for Outer Space with High-Fidelity Simulation" by Toby Bell

Captions
Good afternoon, and welcome to Designing Dope Distributed Systems for Outer Space with High-Fidelity Simulation. My name is Toby Bell. I'm a PhD student in the Space Rendezvous Laboratory at Stanford University, advised by Professor Simone D'Amico. Many thanks to Strange Loop for inviting me to speak this year. I'm really honored to be here, especially for the last year of it, and thank you to all of you for giving me your time and attention this afternoon.

In this presentation, the primary thing I want to share is a case study in advanced software testing techniques, in three parts. In the first part, I'm going to talk about a spacecraft flight control software package for an upcoming space mission called VISORS that my team and I have been working on for the last year and a half or so. It's dope. It's a pretty cool piece of software, if I say so myself — I'm not biased. It's an unprecedented mission in many ways, and it certainly presents a lot of challenges and opportunities. Then I'm going to talk about a testing technique we've used extensively in developing this software. In my opinion, this has been indispensable in letting us create the software the way we have: I think we've been able to move quicker, test more thoroughly, and ultimately create more capable software than we could have otherwise. And this is not just for space — I hope these techniques might be useful to some of you, because they can be applied to a wide range of systems.

First up: VISORS. VISORS is a satellite space mission launching next year, in October 2024. It will be the first distributed space telescope, consisting of two 6U CubeSats flying in close formation in low Earth orbit. For reference, a 6U CubeSat is about 20 cm by 30 cm — you could kind of hold it in your hand. The
science objective of this mission is to image the solar corona in the extreme ultraviolet spectrum at higher resolution than has ever been done before, and we're aiming to do that on two very low-cost 6U CubeSats. It's a very ambitious mission given its low cost; it ends up toeing the line between science mission and experimental technology demonstration, so it's very exciting. It's a multi-institution collaboration between, I think, ten universities, NASA Goddard Space Flight Center, and a company called Blue Canyon Technologies. It was funded by the National Science Foundation, I believe originally in 2018 or 2019, with an original grant amount of $4.6 million.

As a distributed space telescope — like many telescopes — we consist of an optical instrument and a detector. But in our case, to achieve the unprecedented resolution we're going for, the focal length of this telescope is 40 meters. That means we want 40 meters between our optics and our detector, which is probably a little longer than the length of this room. Because we have neither the budget nor the launch vehicle for a spacecraft like that, the approach we're taking is to put the detector and the optics on two separate spacecraft, which will align themselves with the intended observation direction. It's very sci-fi, and it has never been done before.

We'll end up holding pretty tight tolerances on their alignment: laterally, we have a tolerance of 1.8 cm on the alignment of the two spacecraft, and along the line of sight we have a tolerance of 1.5 cm, and we'll be holding this alignment for an exposure time of 10 seconds. In terms of the relative geometry of the two spacecraft, we get an observation attempt once each time we come around the Earth, so when we're in science mode doing observations, we get this opportunity about every 90 minutes.

The high-level concept of operations for the mission: we broadly divide the mission into two modes. Standby mode uses a
slightly further separation of 200 meters for increased safety, and we only enter science mode, via a transfer maneuver, for short durations — to perform a limited number of observation attempts before transferring back to standby.

To give a visualization of this concept of operations, I have a little demo I can run. [Fumbles with display mirroring for a moment.] Okay, great. So here's the Earth, and here are our two spacecraft, at an altitude of about 550 km. They're just white dots — sorry, I only had so much time. This one is the detector spacecraft; this one is the optics spacecraft. The reference frame we're using for this visualization, with these three coordinate axes — some astrodynamics for you — is called the RTN frame. We use it often in spacecraft relative motion, and it's defined locally, depending on where you are in your orbit. The R direction, red here, always points up, radially away from the Earth. The T direction, in green, points generally in the direction you're moving in the orbit. And the N, or normal, direction is sort of like your angular momentum vector — it's normal to your orbit.

In this visualization we've got a couple of reference ovals, which are not perfectly positioned — again, I only had so much time — but the green orbit roughly represents that 200-meter standby orbit, and the red orbit is that 40-meter science orbit, and we have an animation showing the transfer trajectory between them. The yellow vector, which will be moving, is our intended observation direction, which is essentially just pointing towards the Sun at all times. This will be moving around as the animation plays, since the orientation of the RTN frame is changing, but it's the same
direction in inertial space.

So the transfer looks like this: our software plans a sequence of maneuvers, shown here as yellow points in the orbit, that are designed to take us from the standby trajectory to the science trajectory and back. This transfer sequence takes maybe ten orbits or so, which might be the better part of a day, but it gradually takes us from one to the other while preserving desirable safety properties of the orbits, so that we don't have any risk of collisions. I'll fast-forward a little to when we're in science mode, and you can see that once we're there, the science trajectory we're in is constantly tuned by maneuvers that are planned autonomously on board the spacecraft. The loop needs to be closed tighter than we could achieve using ground stations, so this all has to be done on board. We achieve alignment with the intended observation vector once per orbit: as that yellow vector pointing towards the Sun swings around, you can see we just line up and kiss it once per orbit, and each time we do that, we do an observation attempt. When we're done with all of those, we start a transfer back to standby mode, which is also done autonomously in the worst case, or by ground command if we want to do it sooner. So that hopefully gives a sense of what these two spacecraft will have to be doing, mostly autonomously. Again, it's very sci-fi — I think this is the coolest stuff.

[Returns from the demo.] The piece that my team at the Space Rendezvous Laboratory is creating is the GNC — guidance, navigation, and control — subsystem for the mission. This is a stateful software component that is responsible for doing all of the motion estimation and control for both spacecraft. We run a mission-mode state machine on both spacecraft, in sync, to maintain that high-level concept of
operations of the mode switching. We are responsible for maintaining safety of the formation using passive and active collision avoidance, and ultimately we need to achieve that required centimeter-level observation alignment. More generally, my lab, the Space Rendezvous Lab, focuses on advanced multi-satellite systems to enable precise navigation and control, often with the end goal of novel science applications and environment characterization, and on performing rigorous validation and testing of these algorithms along the way. If you're interested in more of our work, you can check us out at slab.stanford.edu.

To give a bit of context on how our piece fits in with the rest of the system: we have two instances of the guidance, navigation, and control library running on the two spacecraft. Each is wrapped in another piece of software called the hosted software app, developed by a team of students at Georgia Tech, which runs on top of Linux on a flight computer provided by the bus from Blue Canyon Technologies, our commercial partner. That's connected to a bunch of hardware — mostly sensors and actuators — to tell us where we are and allow us to control where we're going; those are connected to reality. We also have a crosslink for networking between our two spacecraft. It's a lossy crosslink, and it's also a research component: a new piece of hardware that hasn't flown before this. We have two GPS receivers on both spacecraft, connected to up to 31 active GPS satellites that are up there. GPS, incidentally, is how we're achieving the very high-precision navigation: we will be doing the — to our knowledge — first ever use of real-time differential GPS with integer ambiguity resolution in space, which is how we can get centimeter-accurate relative state estimates from GPS, which is normally only accurate to meters in the best case. We also have ground radios.

I want to give a bit of a sense of the actual API to
the GNC system, because it's relevant to the rest of my talk. At a high level — I copied this from the header file we delivered to Georgia Tech — we just have a struct that says: here's our working memory for GNC, please put this somewhere. They can allocate it however they want. We give them a function to initialize it; they can provide some callbacks, function pointers, for us to get outputs to them; and then we have a bunch of functions they can call to give us various kinds of information at any time.

I'll go through these briefly, because they give you a sense of the kinds of computations we're doing. On the input side we take: configuration data; time-at-tone, which comes from our spacecraft bus and contains attitude information — our orientation — and general system housekeeping; GPS messages, straight from the receiver hardware; laser ranges; crosslink messages from the opposite GNC instance on the other spacecraft; ground messages; telemetry from the propulsion system, which we use to react to and tune our maneuver planning; the current formation-level target — that high-level mission concept of operations; and a tick, which is basically a request to do some work at a later time that we might have asked for in the past.

On the output side, they provide a bunch of function pointers to us, so we can output: maneuvers — whenever we want one performed, we call that callback and they'll send it to the hardware; power states for various hardware; our own telemetry, which is just recorded for downlink; crosslink messages to send to the opposite spacecraft; ticks — those requests to do work at a later time; a signal before and after an observation; the current mode, for triggering various configurations on the spacecraft; a fault-detection flag for whether we think something has gone wrong in GNC, so that at a system level they can take a reaction; and a safety-violation output.
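To make the shape of this concrete, here is a heavily simplified sketch of what an API in this style can look like: client-allocated working memory, function-pointer callbacks for outputs, and plain functions for inputs. All names and types here are invented for illustration — this is not the real VISORS header.

```cpp
#include <cassert>
#include <cstring>

// Sketch of a GNC-style C++ API (illustrative; names invented).
// The client owns the working memory, registers output callbacks as
// plain function pointers, and pushes inputs by calling functions
// whenever data arrives -- there is no hidden I/O or scheduling inside.

struct GncMemory {
    unsigned char bytes[1024];  // opaque working memory, client-allocated
};

struct GncCallbacks {
    // Outputs: GNC calls these whenever it has something to hand back.
    void (*output_tick)(double epoch_s);  // "call gnc_in_tick at this time"
};

struct GncInstance {
    GncMemory*   mem = nullptr;
    GncCallbacks cb{};
    int          inputs_seen = 0;
};

double g_last_tick_request = -1.0;  // used by the usage example below

// Initialize GNC inside client-provided memory.
void gnc_init(GncInstance& g, GncMemory& mem, const GncCallbacks& cb) {
    std::memset(mem.bytes, 0, sizeof mem.bytes);
    g.mem = &mem;
    g.cb  = cb;
}

// Inputs arrive asynchronously, straight from whatever produced them.
void gnc_in_gps_message(GncInstance& g, const double pos_m[3]) {
    (void)pos_m;       // a real implementation would update its filter here
    ++g.inputs_seen;
}

// The client calls this back at the time a previous out-tick requested.
void gnc_in_tick(GncInstance& g, double epoch_s) {
    ++g.inputs_seen;
    if (g.cb.output_tick) g.cb.output_tick(epoch_s + 1.0);  // ask again later
}
```

The hosted app — or a simulation — wires `output_tick` to its own scheduler; a captureless lambda converts cleanly to the function pointer.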
We also have a callback for us to request the current system time, which we just use for internal benchmarking of algorithms.

To give a sense of our code: our flight code is just a bunch of C++. We have three main libraries — libslab, libdigital, and libvisors. slab and digital are shared with other flight projects in my lab; visors is dedicated to this mission. You can see it's pretty tiny. We go to pretty great lengths to keep the amount of code we have small. This is not because our code is simple, but because we are pretty intentional about avoiding duplication, and for any given thing we want to do, we try to do it as absolutely simply as we can — there's enough complexity to juggle already. We have a few third-party dependencies: ECOS is a second-order cone program solving library for numerical optimization, used in one of our maneuver-planning algorithms, and SOFA — Standards of Fundamental Astronomy — is a library containing a bunch of coordinate frame definitions and other astrodynamics routines. Those are a lot of lines of code, but we only use one or two functions from them. And then we have a library of our support code — our testing code — which I'll be talking about in the rest of this presentation.

We have a very small team: it's four of us, myself and three other students, as well as our advisor. So our time is limited, and none of us is really working on this full-time — we're all students trying to work on publishing and our own research — so anything we can do to move faster is a great win. For the sake of time I won't talk through these in detail, but there are a lot of challenges we face in designing this software; here are some of them. Take a look, and you can ask questions about them later.

So that's an overview of VISORS. I now want to talk about the way that we used
simulation testing extensively throughout the development of this, and what that looked like for us.

As far as software environments go, spacecraft software has probably one of the most extreme disconnects you can find between development and production. Consider the number and extent of factors that contribute to the environment the software runs in on a spacecraft. There are so many things coming into the spacecraft as inputs that the software should in theory expect to be there, and might rely on: sensor measurements, networking conditions, connectivity with ground stations and other spacecraft, telemetry from hardware systems — as well as the often somewhat obscure processors and operating systems of space-grade hardware. If you're developing on a desktop computer and want to compile, run, and make some changes, you're probably not going to have all of these things looking the same way they would if you were circling the planet every hour and a half.

On the other hand, the responsibilities of spacecraft software tend to be pretty mission critical — or can be. They can affect the lifekeeping of the spacecraft, your ability to even maintain contact with it in the worst case. If you end up bricking a spacecraft, there's no reset button you can press — unless you design one in, which you should — but for the most part, you want your software to work the first time if you can. It should come as no surprise, then, that simulation is used extensively in aerospace. It's also used in automotive, and really in any field where you're writing software that should operate in an environment with these unique characteristics.

So what does that normally look like in space software? Usually you take your system, isolate the pieces that are your software — your system under test — and say: we're going to replace
everything else with the simulation, and run our software based on inputs from that simulation. Very simplified — boiled down — the simulation ends up taking the form of repeatedly stepping time forward and writing a model of what it looks like for the state of those things to step forward. This is not your software; this is everything you're replacing — everything else in the system that is not your software. You basically write software that mimics the behavior of all of that stuff.

In VISORS we ended up using a simulation testing framework — and despite simulation being ubiquitous in space applications, you'd be very hard-pressed to find spacecraft software that was not using simulation testing extensively. I think there are three unique aspects to what we ended up implementing for our testing framework. First, extreme high fidelity — I'll talk about what I mean by that in a minute. Second, it's very deterministic — we prize this very much. And third, it's much faster than real time. Our simulation basically looks the same as the diagram I just showed. Here's roughly what it looks like for us to compile it: we take our flight code in libgnc.a and a bunch of testing code in libsupport.a,
and we just compile them together and get a single executable out, which we can run. I'll show a demo of running that, and while it's running, I'll talk a little about what's going on.

This is a simulation case that we run all the time in SLAB. It takes about one minute to run, and it runs that full trajectory you saw earlier — the visualization I generated earlier was produced directly from a log of the output of this simulation. It is running the full GNC software on both spacecraft, as well as the ground software we're writing for our part in SLAB, and doing so in a completely deterministic, much faster than real-time way, despite running the actual, full, as-compiled flight software.

The benefit we get from running everything in a single process — and, in this case, a single thread — is very easy debuggability. We can run the entire system, litter assertions everywhere, and if anything ever breaks we can see exactly where; we can step through calls between what would be literally different computers running different pieces of software within the simulation. We also, kind of interestingly, get system-level profiling: you can just run a CPU profiler on this executable and look at the most CPU-intensive parts of your entire system, regardless of where they would be running if you actually deployed onto different pieces of hardware. Because we run on limited spacecraft hardware, that has also been indispensable for making sure our algorithms run in the times we need them to. So we run that all the time. That was about 48 hours' worth of simulation — exactly what our software would be running over those 48 hours, in theory — within about a minute.

I'll talk a little more about each of these three aspects: what I mean by each, and how we achieved it. First, high fidelity. What does this mean for us? There are two key components of our
simulation being high fidelity that I think are very important.

The first is testing our real, bona fide flight software as the system under test. I put this here because in my experience in the space industry — other places I've worked — you often have simulation testing, but it's applied to a somewhat simplified model of the software: maybe a prototype written in a different programming language like Python or MATLAB, or something like MATLAB Simulink, counting on automatic code generation to turn that into C code. That can work for simple systems, but for something at the level of complexity of VISORS — where we're very concerned with squeezing out every time interval we can and not wasting any time on outdated information — we want to build our real software as it will be compiled in flight, in C++, and test that directly.

The second is to model as many sources of noise in your simulation as you can. There are many of these in our simulation. You can imagine there are a lot of physical ones — sources of uncertainty in the physical effects and forces on our spacecraft. We don't always know exactly how dense the atmosphere is at each point in the orbit. We don't know how bright the Sun will be at any given time, which affects the amount of swelling of the atmosphere, and also the force from the momentum of photons bouncing off our spacecraft. There's also the non-spherical gravity of the Earth. So we model as many of these physical effects as we can.

I'm going to talk about one in more detail, because I think it'll be most interesting to most of this audience: the network aspect of our crosslink. We use our crosslink directly; we have a datagram model for it, and it's lossy. So we want to simulate this, and we do it by having, in our simulation code, a function that represents the act of sending something
across the crosslink, for which we can define some noise parameters: a minimum, 3-sigma, and maximum bound on how long it should take for something to be received, and also a drop rate if we want. We can make these numbers however bad we want if we want to stress-test things. In this implementation we currently use a log-normal distribution, which is probably not the most physically accurate model of network latency. We actually don't have a characterization of how our particular crosslink radio — which is untested hardware in space — will fare at the 40- and 200-meter separations, so we are pulling numbers out of our ass, but we can pick conservative numbers if we want to. Once we calculate how long we want receipt to take — and whether we want the message to be received at all — we can just schedule an event in our simulation that says: at this time, please run the code that whoever called me passed in, representing the receipt of this message.

We use this from our simulation of the hosted software app — the part Georgia Tech is writing, which wraps the GNC library. In our callback — this gets passed to the GNC library as the function pointer that represents sending a crosslink message — we can just get a reference to our peer, because we're in simulation and can access whatever we want, and say: send a crosslink. This models the randomized delay and drop percentage, and at the other side we just call the GNC crosslink-message input from our API directly. So that's a bit of a look at some of the ways we model noise sources in our simulation.

Next, determinism. This is very important: it's what enables us to really get all the benefits of debugging and traceability — and, worst case, print-statement debugging — because if you run a simulation for a month of your mission and something goes wrong on day 20, you can just add some instrumentation and run it again.
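These two points — modeled noise and deterministic replay — fit together in a small sketch of a lossy-link model like the one just described. All names here are invented and the numbers are placeholders, not characterized values; the point is that the randomness flows through a generator the caller seeds, so rerunning with the same seed replays the same drops and delays.

```cpp
#include <cassert>
#include <cmath>
#include <optional>
#include <random>

// Illustrative lossy cross-link model (invented names; placeholder numbers).
// A datagram is dropped with probability drop_rate; otherwise its one-way
// latency is drawn from a log-normal distribution. A simulation would take
// the returned delay and schedule a "message received" event that far in
// the future on the peer spacecraft.
struct CrosslinkModel {
    double drop_rate;         // probability a datagram is simply lost
    double median_latency_s;  // median one-way latency, seconds
    double sigma;             // log-space spread of the log-normal

    // Delay until receipt, or nullopt if the datagram is dropped.
    std::optional<double> sample_delay(std::mt19937_64& rng) const {
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        if (coin(rng) < drop_rate) return std::nullopt;  // lost on the link
        std::lognormal_distribution<double> latency(
            std::log(median_latency_s), sigma);
        return latency(rng);
    }
};
```

Because every random draw goes through the `std::mt19937_64` the caller owns and seeds, two runs with the same seed produce the identical sequence of drops and delays — which is exactly the replayability property discussed next.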
You're going to hit the exact same problem when you run it again on day 20. There are two ingredients: make your system under test deterministic, and make the simulation deterministic. I say these intentionally — they sound obvious, but it does take work and intention to do.

In our case, for making our system deterministic, there's no magic — no secret sauce or enforcement for it, especially because we're in C++. You could certainly use language tools to enforce determinism, by using a more pure functional language, but in our case we want the flexibility and control of C++, and also the broad compiler support for a wide range of targets, given the diversity of space hardware. On that note, with C++ you can intentionally violate determinism a little if you want. You might notice we don't provide a memory-allocation callback: we make entirely unwrapped calls to malloc, and we do use dynamic allocation. Technically that makes our system non-deterministic, but we're accepting that, and we have tight memory budget requirements anyway, which we're monitoring for our mission — so for all intents and purposes, malloc will always work for us.

The rest of our calls, however, are designed to give GNC everything it needs to compute the same result every time. In particular, I'll call out a few things. We make all of our networking and all of our scheduling very explicit. For example, for our crosslink messages and ground messages: like I said, we're running on Linux, so we could open sockets internally inside our software. We choose not to, because we want to make our software's behavior a pure function of what is passed to these input calls. We don't have to do that, but it's desirable: it allows this complete determinism and ease of testing. Same thing with tick: we basically use it to replace calls to the operating-system
scheduler. Instead of a sleep call — saying to the operating system directly, "I want to suspend my thread, please call me back in 5 seconds" — we create this as an output from GNC and leave it up to the client to define what that means.

For simulation-side determinism: use a seeded random number generator — we use the C++ built-in Mersenne Twister — and you can do this in presumably any environment. Again, no magic sauce; just be disciplined about it, and make sure you don't end up using a non-seeded call or introducing some other source of randomness. If you do, you'll notice it; track it down before it gets too bad, and eliminate it.

Lastly, the faster-than-real-time aspect. We do this using event-driven simulation, built on top of a broadly event-driven system. We created GNC from day one to be defined by asynchronously arriving inputs: the hosted software app that Georgia Tech creates is free to call us with sensor measurements and received messages at any time it wants. We describe that in our API to them — we would just like them to call us as soon as they have any information, and we don't want to impose restrictions like "you must call us at this rate." If they do that, and we perform a little bit of work in response to each input, we have a lot of empty time in between, when nothing has come in on the network and no sensor has produced any measurements. So we can just get rid of all that time in simulation and skip from one interesting thing to the next, and we'll still get the same result.

We do this by implementing a very simple event loop in our simulation. In our top-level Sim object we have just a priority queue of events, and we have that schedule call, which you saw earlier when I implemented the crosslink delay, where we can take some callable object in C++, create a timestamped event object with it, and throw it into the priority queue.
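The scheduling machinery just described can be sketched in a few lines. This is an illustrative sketch, not the actual Sim class: a priority queue of timestamped callables, a schedule call, and a run loop that jumps from event to event, handing the elapsed gap to a physics step.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Minimal event-driven simulation loop in the spirit described above
// (illustrative sketch). Events carry a timestamp and a callable; the
// run loop jumps straight from one event to the next, propagating the
// "analog" world by the elapsed gap instead of waiting in real time.
class EventSim {
public:
    using Fn = std::function<void()>;

    void schedule(double t, Fn fn) { queue_.push({t, std::move(fn)}); }

    // step_physics: propagate true dynamics forward by dt.
    void run(const std::function<void(double)>& step_physics) {
        while (!queue_.empty()) {
            Event e = queue_.top();
            queue_.pop();
            step_physics(e.t - now_);  // skip the empty time in one jump
            now_ = e.t;
            e.fn();                    // run the code someone scheduled
        }
    }

    double now() const { return now_; }

private:
    struct Event {
        double t;
        Fn fn;
        bool operator>(const Event& o) const { return t > o.t; }
    };
    double now_ = 0.0;
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> queue_;
};
```

A pure software simulation would pass a no-op `step_physics`; ours propagates orbital dynamics by `dt`.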
In the priority queue, events are sorted by the time they should occur. Then, in our run loop — the top-level run loop of the simulation — we just repeatedly dequeue the next event. Because we want to run a physical simulation as well, we compute the amount of time to the next event and run a step function that says: please propagate the true dynamics of all our analog stuff forward by that amount. If you're doing pure software simulation, you might not need that analog side, and it could just be a plain event loop — like what powers JavaScript — where you're just running a sequence of functions.

Using this technique to skip all of the empty time in our simulation, we can run a 30-orbit simulation in one minute, which is about a 2,880x speedup. That depends on your hardware — this is just on my computer — and, like I said, this runs completely single-threaded. There's no reason the simulation has to be single-threaded; it's nice for debuggability and simplicity, and we've kept it that way, but if you were trying to simulate many simultaneous hosts running a bunch of different software — a hundred nodes of something — you could parallelize your simulation and run them on separate cores, because they should work that way anyway: ultimately they're going to be on different computers. So that's our faster-than-real-time simulation. Between all three of these aspects, this has been a really indispensable tool in allowing us to move very quickly — to make changes and instantly see how they impact our entire system in a very controllable way. That's a look at simulation testing as we used it for VISORS.

Third and finally, I want to talk about some takeaways for applying these techniques to other systems, not necessarily space systems. Let's talk about software testing for a second. I'm going to throw up two
extremes of software testing. We can dispute what exactly each of these means, but let's just say: I'll know it when I see it. Unit testing is that kind of thing you do on functions and little pieces of your program; the tests run quick and they're a little easier to work with. End-to-end testing is what you do when you have those scripts that stand up a bunch of services, and sometimes it works and sometimes it doesn't, but it's important to do. Between those, my thesis would be that unit testing ticks a lot of boxes much better than end-to-end testing, with two important exceptions: it usually does not test the whole system, and there are a lot of details — which the devil can be in — that only show up when you're testing the whole system.

I would propose that a simulation testing approach like what we have for VISORS does a pretty good job of hitting both sets of strengths. There's nothing that prevents us from applying our simulations to smaller components, and we actually do that: there are multiple smaller subsystems within the GNC component that we have our own simulation tests for — for example, running the navigation side independently of the control side to test just our estimation, or vice versa. So you can drill down as much as you want if you want to test a piece in isolation. But we also have determinism; we encourage good code structure by making these interfaces explicit; we have debuggability; the tests are easy to run, automated, and very quick; and we can test, for the most part, the entirety of our system.

There are two related past Strange Loop talks that are very relevant to this kind of approach to testing software. The first is a Strange Loop 2014 talk — I was talking about it with some people earlier today; it's a great talk, and it touches on a lot of the same points I've mentioned. It's from FoundationDB, and it describes a C++ transpiler they've written, called Flow, that they use to take imperative-looking code in C++ and turn it
into asynchronous code and run it in a very similar kind of event-driven, artificial simulation environment, where they can completely model all of the network behavior and disk access for their database. I want to draw a few contrasts with that talk. There's a lot of great stuff in it, but I think something it misses is that you don't need a new language — you don't need a transpiler or a big framework to do this kind of technique. You can get away with being intentional about the interfaces in your application, using pretty lightweight event-loop implementations and things like that, and modeling your domain however is best for you. I think there are probably many teams that feel like simulation testing would require adopting a big framework, when in reality it might not.

The other talk is also from Strange Loop 2014, and it's called simply Simulation Testing, by Michael Nygard. In it, he talks about very similar kinds of things. His talk is in Clojure, and he mentions a few libraries they use to do dynamic input generation, model simulated traffic to a service, and test the entire flow of their system in response to that traffic. Again, something I think is maybe missed in that talk is that you can do this in any language — you can do it in a systems language. My takeaway from that talk is that it can feel like the functional aspect of Clojure is very important for enabling the kind of testing they do, and again I would contest that: it just depends on you being intentional about making explicit all of the input and output of your system. You can always make a pure functional system out of a language like C++. The upside of C++ is that you can take little shortcuts here and there — like I said, we decided to skip wrapping malloc; malloc is fine
with us so the biggest single piece of advice takeaway that I would give from my experience implementing this in visors is pay extreme attention to the boundaries of your system what do I mean by that if our software is our GNC libraries and a little bit of ground software the most important thing in how well we can test our software using simulator is the design of our apis for these if we look again at the API for GNC I think there are three particular things that are somewhat unusual about this API design um those are the absence of uh networking scheduling and like file IO that we make explicit here that sometimes uh in I could imagine other apis they would just sort of be invisible they'll happen for you we'll just do them with the operating system which seems convenient but it has drawbacks so to call these out in the case of scheduling we have this in tick and out tick function what is this uh we call out tick as a call back from GNC whenever we want to do some work at a later time and we just Define it's part of the API it's part of the interface that whoever implements this whoever our client is whatever the user is they better call us back at that time in the future on the one hand this makes more work for them on the other hand it makes it very easy for us to test our full software as built in a simulation environment faster than real time this ends up really being equivalent to what you would normally doing be doing with any other scheduling kind of interface whether you're calling sleep and putting something after it or calling a set timeout in JavaScript and passing it some code to run in both of those cases you're calling some system saying I would like to be called at a later time please jump to this code it's a little invisible in the Sleep call but you are passing a pointer in your return address uh and then the scheduler is just jumping back to you at a later time to give you that control again so all we've done with our API though it looks a 
little unusual is Make That explicit um on the network side we have our input cross link message and output cross link message you already saw those earlier that's very similarly often that would be invisible in an API um internally you might open a socket and say please send this packet I'll ask the operating system to tell me back that's that's fine it makes your API convenient but by talking directly to the operating system you make it much harder for yourself to test your software's behavior and your software's behavior alone it's almost like you're making the entire operating system a part of your software's behavior and lastly file access we have a few calls that are kind of equivalent to US reading from a config file in practice our configuration will be stored in non-volatile storage on the spacecraft we could have read that ourselves but we didn't this makes it very easy for us to test deterministically when we output Telemetry we have a call back for it we don't write to a file ourselves so by removing all of these things from our system this is how we achieve the desirable properties that make them very easy to test deterministically and quickly and when we wrap them in a simulation we can interface with all those removed removed pieces and basically mock whatever those Services would have been so at the end of the day I've said this a few times I don't think there's anything super magical or fundamental going on here this is just follow all of the testing advice you've ever heard um but at the level of the whole system as opposed to individual functions so before I started on this project I had not ever really seen this done super concretely before I didn't really have a good feel for what it meant to say you know remove these dependencies at a system level I'd seen lots of examples of dependency injection mocking whatever for functions classes but it's not so obvious what it means to do that for a whole network or a file system or AER now one might say 
Let me let me gather my thoughts Toby of course it's easier to test now you've removed all of the important stuff the networking the file system access the hardware interfaces you've removed all that from your system but that stuff still has to exist somewhere and now it's just going to break outside of what you're testing I I had to rehearse that a few [Applause] times uhoh sorry from your system but that stuff still has to exist somewhere so now it will just break outside of the what you're testing fals um okay my response to this I I think there's some truth to this but I think it's not a complete criticism of this technique for a few reasons the first is uh we've done it so as part of our testing in the space rendevu lab of our software uh we've done some processor in the loop testing where we take the GNC component and run it on a Raspberry Pi why because it's what we had lying around and it's arm 32 which is much closer to what the flight computer will run than my my laptop um and we just put a very small serialization uh library in between the API on both sides and put a executable around it that just opens a socket and reads data over a TCP socket and calls those functions in GNC and wraps the callbacks from the outputs of GNC and sends those back over the TCP socket and uh this was for all intents and purposes a very small piece of code and also subjectively it was really not a pain to write this code or to test it um none of the flakiness in the system came from anything that we wrote here we wrote this and after maybe a few cycles of debugging like okay I I got that argument to that system call wrong uh it it worked fine the behavior that all the flakiness that we're testing had already been debugged previously from running our software in the loop tests with our modeled noise and our cross link all of the other unreliability sources that we have in our simulation and once we got this running on a separate processor it ran perfectly fine so based on that 
experience I would say that even if you're taking that thin layer of interfacing with an operating system and moving it outside of your system I actually don't think that's really where the complexity is if you've defined your interface in such a way that you can Implement that simply on top of operating system interfaces you'll probably not have too hard a Time the other reason is uh ultimately you can do a little bit of end to end testing I mean you can do a lot of endend testing you can do as much as you want the point of simulation testing is to be used as maybe a a more common default something can run quicker day-to-day I mean we just use this in place of unit testing a lot of the time because it it tests with more coverage than we'd get from unit testing and it is so fast for us to run we can just test our whole system in a minute yeah okay so that's my biggest takeaway uh is design your systems with interfaces to remove the internal stuff that you might be used to just reaching directly for an operating system API for and make it a very concrete part of your system boundary and once you put that actually running on production you can write a thin little wrapper around it that connects those two but in simulation you going have a really easy route in to make everything really deterministic and fast I think that's about what I just said so I won't repeat all that so those are a few of my takeaways um in summary uh visors is a very exciting Mission and a very complex mission that I don't think we would have been able to develop as well as we have um and with the amount of confidence that I feel in the software after seeing it run and seeing it misbehave and crash many times uh over unreliable network connections and all that and having debugged that but in a way where we can run a simulation the exact same way over and over and figure out exactly what is going wrong when something goes wrong and fixing that um these techniques are not just applicable to space 
though space makes broad use of simulation uh I think it's something that could benefit a lot of especially distributed systems but any software and uh I think the key thing for enabling that is intention about making all of your input and output explicit to say it for the hundredth time thank you very much that's my [Applause] talk
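The claim above that you don't need Flow or a big framework, just a "pretty lightweight event loop implementation," can be made concrete. The following is my own sketch, not the VISORS simulator: a deterministic, single-threaded event loop that orders callbacks by simulated time, so a run is exactly reproducible and executes far faster than real time (there is no wall clock and no thread):

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// A pending callback in simulated time. `seq` breaks ties so that
// equal-time events always fire in a deterministic order.
struct Event {
    uint64_t time_ms;
    uint64_t seq;
    std::function<void()> fn;
};

// Orders the priority_queue so the earliest event is on top.
struct Later {
    bool operator()(const Event& a, const Event& b) const {
        return a.time_ms != b.time_ms ? a.time_ms > b.time_ms : a.seq > b.seq;
    }
};

class SimLoop {
public:
    // Request a callback `delay_ms` of simulated time from now.
    void schedule(uint64_t delay_ms, std::function<void()> fn) {
        queue_.push({now_ms_ + delay_ms, seq_++, std::move(fn)});
    }
    uint64_t now_ms() const { return now_ms_; }
    // Drain all events, jumping the clock straight to each one.
    void run() {
        while (!queue_.empty()) {
            Event e = queue_.top();
            queue_.pop();
            now_ms_ = e.time_ms;
            e.fn();  // may schedule further events
        }
    }
private:
    uint64_t now_ms_ = 0;
    uint64_t seq_ = 0;
    std::priority_queue<Event, std::vector<Event>, Later> queue_;
};
```

Because the clock jumps from event to event, a simulated hour of spacecraft activity takes only as long as the callbacks themselves take to execute, which is what makes running the whole system "in a minute" plausible.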
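The in-tick/out-tick idea, where the component never sleeps or sets OS timers but instead asks its host, via a callback, to call it back at a future time, might look something like the sketch below. The names (`Gnc`, `in_tick`, `on_out_tick`) and the 100 ms period are illustrative assumptions, not the real VISORS interface:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical component using the "scheduling as part of the API" style.
// The component owns no clock: it only reacts to in_tick() and requests
// the next wakeup through the out-tick callback its host registered.
class Gnc {
public:
    // Host registers how the component requests a future wakeup.
    void on_out_tick(std::function<void(uint64_t wake_ms)> cb) {
        request_tick_ = std::move(cb);
    }
    // Host promises to call this at (or after) every requested wakeup time.
    void in_tick(uint64_t now_ms) {
        ++ticks_;  // ...do one cycle of real work here...
        request_tick_(now_ms + 100);  // ask to run again in 100 ms
    }
    int ticks() const { return ticks_; }
private:
    std::function<void(uint64_t)> request_tick_;
    int ticks_ = 0;
};
```

In production the host wires `on_out_tick` to a real timer; in simulation it wires it to a simulated event loop, and the component cannot tell the difference.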
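The same pattern covers the networking and file-access boundaries: inbound packets arrive as plain function calls, outbound packets and telemetry leave through callbacks, and configuration is handed in rather than read from disk. This is a hypothetical sketch in that spirit; the names and the echo behavior are mine, not the VISORS API:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;

// A component whose entire I/O surface is explicit: no sockets, no files.
class CrosslinkNode {
public:
    // Host supplies config (in flight: read from NVM; in sim: a literal).
    explicit CrosslinkNode(std::string config) : config_(std::move(config)) {}
    // Host registers where outbound packets and telemetry lines go.
    void on_output_crosslink(std::function<void(const Bytes&)> cb) {
        send_ = std::move(cb);
    }
    void on_telemetry(std::function<void(const std::string&)> cb) {
        telem_ = std::move(cb);
    }
    // Host delivers inbound packets; here we just log and echo them,
    // standing in for whatever the real message handling would be.
    void input_crosslink_message(const Bytes& msg) {
        telem_(config_ + ": rx " + std::to_string(msg.size()) + " bytes");
        send_(msg);  // reply goes out through the explicit boundary
    }
private:
    std::string config_;
    std::function<void(const Bytes&)> send_;
    std::function<void(const std::string&)> telem_;
};
```

A simulation harness can now inject corrupted or dropped "packets" and capture every output deterministically, because the operating system never enters the picture.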
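The "very small serialization library" from the Raspberry Pi processor-in-the-loop test is not published in the talk, but the shape of such a thin wrapper is roughly this: a length-prefixed framing layer that could sit between the component's API and a TCP socket. This sketch is exercised against an in-memory byte stream instead of a real socket, so even the wrapper logic stays testable without networking:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Prepend a 4-byte little-endian length header to one message.
Bytes frame(const Bytes& msg) {
    Bytes out;
    uint32_t n = static_cast<uint32_t>(msg.size());
    for (int i = 0; i < 4; ++i) out.push_back((n >> (8 * i)) & 0xff);
    out.insert(out.end(), msg.begin(), msg.end());
    return out;
}

// Pull the next complete message off the stream, if one has fully arrived;
// a partial message stays buffered until the rest of its bytes show up.
std::optional<Bytes> deframe(Bytes& stream) {
    if (stream.size() < 4) return std::nullopt;
    uint32_t n = 0;
    for (int i = 0; i < 4; ++i) n |= uint32_t(stream[i]) << (8 * i);
    if (stream.size() < 4 + n) return std::nullopt;
    Bytes msg(stream.begin() + 4, stream.begin() + 4 + n);
    stream.erase(stream.begin(), stream.begin() + 4 + n);
    return msg;
}
```

In the real wrapper, bytes read from the socket would be appended to `stream`, and each deframed message would be handed to the component's input call; outputs would be framed and written back. As the talk notes, almost none of the system's complexity lives in this layer.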
Info
Channel: Strange Loop Conference
Views: 4,380
Id: prM-0i58XBM
Length: 44min 44sec (2684 seconds)
Published: Thu Oct 05 2023