The TTY Layer: the Past, Present, and Future (Greg Kroah-Hartman, Linux Foundation)

Let's talk about the TTY layer, because the TTY layer is a pain point for embedded. Infamously, Nicolas Pitre in 2015 sent out patches to rewrite the TTY layer, to strip it down and make it tinier, because he was worried about Linux not working on tiny devices. It turned out it didn't really work well, and I'll go into a little bit of why. But if you're an embedded device, and especially if you're a real-time device, the TTY layer is a pain. It's really, really bad.

There's a great talk, or great paper, or website, describing the user space side of TTY and where it came from. It's really based in the 1800s: the whole idea of a terminal or a ticker stream, the history involved, where it came from, and how job control works. It's crazy. The TTY layer is really insane, with lots and lots of legacy. Again, 1800s. But that talks about the user space side. Let's talk about the kernel side.

Thomas Gleixner said, maybe three or four months ago at a real-time conference, "The TTY layer is the biggest pain point for real time. Good luck ever fixing it." There's a whole talk about why, and a public article in Linux Weekly News that goes into the issue. So I was like, "Aha, Thomas, you're wrong. I'll show you how you can fix the TTY layer." Spoiler: I failed.

But this is the thing that people don't realize: the TTY layer and serial ports are why all of us are here today. Linux succeeded because we have a TTY layer that works. Linux got successful with ISPs, with serial ports, with modems and all that stuff. People brought it in the back door, it solved a real problem, and it worked. Many, many operating systems can't even handle serial ports at a rapid speed. Hurd famously couldn't go over 9600 baud. The TTY layer proves that microkernels are a bad idea. It's proof, real-world proof. But because of that, and because of the great work that Linus and Ted and Alan Cox and others had done to make this crazy, complex, infinitely flexible infrastructure, Linux
succeeded, and we got to do Linux and make it go into other things and really cool stuff. But the basis of it is still there. Some of the original code from the 1.0 days is still alive in the TTY layer today. It's crazy, but it works, so let's try and explain it a little bit. This craziness and complexity, I'd like to say, is why we're here. We all complain about it, but it's still good stuff.

So there are different things. There's the TTY layer, which I'll talk briefly about. Then there's something called line disciplines, which deal with the data that goes across that line. And then there's what I'll just call serial ports. There are lots of types of serial ports, but just think of one as a UART, a chip that's going to send serial data out the other end and receive serial data. Serial ports can be fake ones, they can be USB devices which have another UART on the other end, or a fake USB device, they can be network devices. You can do all sorts of things, but I'll just talk about it as a serial port, because that's what we're used to in the embedded world.

Serial ports have been around for a very long time. My very first paid job out of university, in 1992, was working on a serial port for PCs and a 286 processor and embedded devices. I'm still getting paid to work on serial ports today. My entry point into Linux itself was to write the USB-to-serial layer and drivers, so I've been doing serial ports for a very, very long time, and they're not going away, despite what everybody thinks. Microsoft tried to get them to go away with USB. It didn't really work, and there are some problems with that, but we still have serial ports today, as you know, on all your devices.

And we're going to ignore consoles. Especially in the embedded world, consoles are kind of magic. They are a serial port, and they are a TTY, and they're attached to line disciplines, and you kind of want to get some data out of them to look at the line. Just
don't worry about them. If you write a serial port driver, you get the console to work almost for free, so I'll ignore them for now. But there's a lot of complexity involved in there, and that's part of why this TTY layer is crazy. Anyway, and oh, please interrupt and ask questions, it's more fun for me that way.

A TTY is a char device, a character device. From user space you open it, you read and write, you can set a line discipline, which is the protocol you're going to talk across this stuff, and you do ioctls. You can set the baud rate, you can set the line settings, all the things you can do for a normal serial port. You can do all sorts of other crazy things: echoes, fake flow control, real flow control, changing the flow control characters, all sorts of stuff. But from user space it's: open a character device, read, write, ioctl, close. Simple, easy. There's a TTY layer that handles this stuff.

Then there are the line disciplines, and a line discipline is the protocol. Linux today has at least 20; I think there are a few more tucked away in different spots. The normal one is called n_tty, for normal TTY. This is what we all think of for a normal serial port, or almost for a console: you want to log in and do something. But then there are pseudo consoles, those are weird, and there are virtual consoles, and there's GPS, SLIP, network protocols. The CAN bus has a protocol for this type of stuff. Ham radios, PPP. There's a video phone modem control buried in the kernel as a line discipline, and it works really well.

There's also a cool one buried in there. Rob Herring did this work to make an internal serial port connection, because it turns out some driver subsystems in the kernel want to talk to other driver subsystems through an internal serial port. Well, we made a line discipline, we hook up the two ends, and we can talk directly through the kernel that way. It's a really nice hack, it works really well, and more and more people are starting to use it, and
it works on any TTY device, so any serial port, any device, any user space TTY device. There are hundreds of different names and whatnot, and it works for anything. So line disciplines are a pluggable infrastructure for protocols. Think of one as a networking protocol, but they do cool stuff. GPS does some really weird stuff. But anyway, they're in the kernel.

And then there's the UART driver. Think of this as your UART, your old 8250, the 8250 driver. You'd think... I mean, it's a free Verilog somewhere, right? People are still making new UARTs. We just got a new UART driver sent to us for the ESP32, I don't know, some weird stuff. Samsung made a new UART. Every hardware company wants to make a UART. Please stop making new UARTs. This is really, really old stuff, and it should be well known. Again, 40 years, 50 years these have been around. Just use the traditional one and go. The 8250 can do DMA now, it can do bit banging, it can do everything, and we have support for it all. It's a complex little beast. Why you would want to write a new one, I don't know. Just use the driver we have and go.

It's, what, 60K? UARTs are not tiny. I use the UART driver as a unit of reference in the kernel. People say, "Oh, they're adding new bloat to the kernel, all this new subsystem and whatnot," and I'm like, "Ah, that's about half of a serial port driver," and they're like, "Oh." So it's a good unit of reference: 60K for your UART driver. Hence Nicolas's idea to slim it down and make it tinier. It turns out he still needed that 60K to talk to the UARTs. It's complex. We can pare it down some more, and I'll talk about how to fix that later.

Anyway, there are tons of them. ISDN is still alive and well. The German railway network uses it in real devices. We keep trying to strip ISDN out of the kernel, but no, there's an active maintainer and it works. Somebody the other day asked me how many people we need in order to keep a driver in the kernel. One. One driver, one maintainer, or one
device. That one happens to be the German railway network, but it works pretty well. There are about 40 different serial port drivers, and then there is a serial port layer which has tons of UARTs hanging off of it, so it's a complex thing. The USB serial subsystem, I think, has about 15 or 20 different chips in it as well. Some are real UARTs, some are fake UARTs, some are bit banging over USB, some are complex beasts. These all fade out and fade out, so you don't really know how many real drivers are out there. There are a lot: just for the serial TTY part at least 40, then add another 50 or 60 per subsystem. It gets big. Linux supports everything, which is awesome.

And here's a cool hack that a lot of people don't know about. If you want a good example of how to write a TTY driver, a serial port driver, in a nice simple way, here it is. It's not really a serial port, it's just printk. You open this character device up, you write characters to it, and they show up in your kernel log. It's actually really good for debugging, really handy for early boot, and really good for USB or device bring-up and stuff. I don't know why more people don't use it. It's also a great reference: 200 lines of code, really well commented, very simple. Please look at this stuff. It's a cool hack, I like seeing this stuff, and more people should copy it. It shows that the TTY layer is very big and complex and onerous, but you don't really have to know it to write a driver that works with it. As another example, the old Linux Device Drivers book, Third Edition, has a chapter on how to write a TTY driver and how to write a serial port driver. That has pretty much remained the same, and it's not that complex to write a working one, despite how complex the core is. Anyway: cool subsystem, cool driver.

So now let's get into why this is a mess. This is the start of the TTY struct. Every TTY device, every character device, has one of these. There's one kref, which is a
reference-counted object, so there's an atomic variable in there. We have at least four mutexes, one of which emulates the old big kernel lock, one read-write semaphore, and one spinlock. I think we have enough. And it goes on: we have two wait queues, two work structs, and then there are two internal structures, each of which has its own spinlock. And, I wish we could get rid of Alpha, they're properly aligned to get a 64-bit (bit, not byte) write on Alpha. That gives you a hint as to how well tuned the TTY layer is: we're worried about writing 64 bits in an atomic way without using a lock.

All these crazy locks that we have, these mutexes and the read-write semaphore: as I walk through and show you how the data flows through the kernel, pay attention. We use them in ways that are backwards. These locks are grabbed the wrong way, they're grabbed in fast ways, and it's done with some weird atomic stores and some SMP buffer flushes. If anybody submitted this code today, everybody in this room would scream and say, "No way." But it works.

I finally looked at this the other day before this talk, and we kind of compressed it, because it had a lot of padding. Every TTY structure is at least 656 bytes big. There's one hole still in there because of that padding for Alpha; if we get rid of Alpha we can make it a little smaller. It's a big structure, and yes, you do open a lot of TTYs on big servers and whatnot. It's a mess, it's crazy. All those locks add complexity, and all those locks add non-determinism, which real time really cares about.

Real time, as you know, is not about speed, it's about being deterministic. I write a byte to that serial port, and I want to know that I will get a return, or that it'll get out the other end, in X number of milliseconds or microseconds or whatever. The TTY layer is not deterministic, and I'll show you all the reasons why. This is why the real-time people hate it, and rightfully so. You shouldn't be doing real-time devices with TTY. I see people using
real-time devices with USB-to-serial on them, controlling robots. That's even more insane, because USB adds additional uncertain latency, USB-to-serial adds additional latency, and it's just bizarre that it even works. Stay away from the TTY layer if you care about real time.

Then there's the driver. You would think it's something small, but we have 36 different function callbacks. That's a lot. You don't have to implement them all, but it gives you an idea of how big and how messy this can be. It's very well documented and nicely done, which is good, but there are 36 function callbacks.

And then there's a port. For every one of the TTYs you open up, you get a port. Think of it almost as a serial port. They're almost one to one; sometimes they're not one to one, but it works out. And inside that port there are even more locks. So the TTY device has a whole bunch of locks, and the port has a bunch more: one reference count as well, more wait queues, three locks. I mean, come on. You'd think we'd be able to consolidate some of these, but they were added to make things faster. These locks only control and touch certain portions of the structure, which means that data flowing through doesn't have to deadlock, and that makes things fast.

And then there's the UART port. The UART port is what the serial layer uses. I don't know why they weren't just called serial ports. We do now have a serial port structure, but I'm not going to talk about that. These are UART ports, and there are 27 callbacks and only one spinlock. But then all the UARTs have a global spinlock, so if you ever start doing line speed changes or anything else for any serial port in the system, they all could potentially block. The real-time guys hate that, but we have to do it in order to control hardware properly. Sometimes one UART controller controls multiple devices, and you have to block them all. Anyway, again, not deterministic.

And then the USB serial
driver. I originally wrote this. Johan, I think he was here last week or earlier, now maintains this stuff. It's a little smaller: only 38 function callbacks. It's a mess, but there's a lot there. These are complex beasts. You can get away with smaller ones, but they're there because you need them. UARTs are not tiny little things. And then there's the USB serial device. Again, I embed another TTY port with those locks in it, and then there's another lock within that, and then there's a struct device, which has locks and other reference counts within it. It's like turtles all the way down. There are so many locks here. It's deep.

So let's talk about the data flowing through the kernel. You shouldn't usually care, but this is what you care about here. This is how the data flows into the kernel from user space when you want to write out the serial port, and then we'll talk about how the data comes from a serial port and gets to user space. It's not obvious. Many, many people ask about this all the time on the mailing list, and I asked about it when I first got started with Linux. It's not intuitive. In doing so I'll show all the places in the kernel where the layer is not deterministic, and why. This is a fun thing.

Let's not talk about the console. TTY write: you open a device node and you write to it, or consoles can do it, or there are other ways within the kernel to open other TTY devices, because of pseudo terminals and other fun stuff. It isn't a normal traditional buffer; we're actually using iterators properly, which makes things a little faster. So we have the iterator. This is how read and write callbacks in the kernel look now: you get a kernel iocb and you get an iov_iter, an I/O vector iterator, and then you iterate over the blocks inside there and suck them in and out. It works really fast. This came from the block layer, and they're now in all character devices. You don't have to do it in a character device, but I strongly recommend that you do. It's more
complex, and there are some hang-ups in certain places, but it works well. So you don't see a traditional buffer. We'll find out how we get to that buffer.

Inside, we finally get down in the kernel from that write to the real work. We call iterate_tty_write and we grab a lock. This tty_write_lock is a big, heavy lock, and we do something really odd here. We first try it: hey, did I get the lock, yes or no? Oh, then let's try really, really hard to get it, and I'll sleep on it. I might time out; there are some timeouts passed in here as well. Did that really work? No? Then I'll fail, and I tell user space to restart this all over again. But that's two tries at a lock, one right after another. Again, that's not very deterministic. The real-time guys are like, "Whoa, what's going on here?" It's messy, but it shows that we really try to grab the lock. We really want to make user space's life a little bit easier, but then again it has to handle restarts no matter what.

Cool, write lock. Then, back in the function, we check the buffer against what was given to us in the iterator, and here's the first mess. If it's too small, we'll just allocate more, because this is user space data. That is very non-deterministic. Imagine you're sending 10 bytes, 10 bytes, 10 bytes, 10 bytes, a thousand bytes, 10 bytes, 10 bytes, 10 bytes. Well, at the thousand bytes it's like, "Oh, that was bigger than the last buffer I saw, so I don't have enough room for that. Let's go call kmalloc, and let's sleep, and let's spend some time. I really want some memory, so please give it to me." And then I'll go free my old data, and I'll kvfree it, which is even worse than a normal kfree. This is a non-deterministic mess. This is not good. From user space, if you want to start writing data and you want to be semi-deterministic, write the same byte size all the time. But that's not always under your control, because you're talking to devices and whatnot.
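
The advice above, to keep every write the same size, can be sketched from user space in C. This is only a minimal sketch and not something from the talk: `write_fixed_chunks` is a hypothetical helper name, and the chunk size is an assumption you would pick for your own device. All it does is avoid handing the kernel a write that is larger than the ones before it; it does nothing about the other sources of non-determinism described in this talk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: send len bytes to fd in pieces of at most `chunk`
 * bytes, so that no single write() is larger than the ones before it.
 * Returns 0 on success, -1 on error (errno is left as set by write()). */
int write_fixed_chunks(int fd, const void *data, size_t len, size_t chunk)
{
    const unsigned char *p = data;

    assert(chunk > 0);
    while (len > 0) {
        size_t want = len < chunk ? len : chunk;
        ssize_t n = write(fd, p, want);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any byte was written: retry */
            return -1;
        }
        /* A short write is fine: advance and send the remainder next time. */
        p += (size_t)n;
        len -= (size_t)n;
    }
    return 0;
}
```

For example, `write_fixed_chunks(fd, buf, len, 64)` on an already opened and configured TTY file descriptor would issue writes of at most 64 bytes each.
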
Again, non-deterministic mess number one. We're going to try and count these. Then we pass the data down to the line discipline, and I'll talk more about that. We copy from user space. Copying from user space can sleep on some architectures: it can fault, it can cause page faults, it can page stuff in and out. x86 is actually really good about this and doesn't do that. Arm64, I think, is better, hopefully better, but it's pretty good. Arm32 does some weird stuff, and 32-bit x86 can do weird, weird stuff. Anyway, we keep on going.

Let's talk about the iterator write. This is really fun. This has been there for about five years. We think we got the iterator right and the logic right, and that we got all the data out of it; otherwise we revert and say, "Oops, we didn't do all this stuff," and push it back into the buffers. Nobody's ever caught this, and it seems to work, so I think we can remove the comment. It would be nice to get Al Viro to check this. It's pretty funny; I think Al wrote this comment too. I didn't look back in the history.

And then, if you saw Thomas Gleixner's talk yesterday (if not, go look at it online, it's all about the preemption model of the current kernel and how things work properly), he calls out a pattern in the kernel as being very, very bad. This pattern: we're in a loop, user space gave us some data, and we tried to write it. Then, ostensibly, we don't want to sit in the kernel; we want to make sure that other things can happen. So we ask, "Hey, is anything happening? Is there a signal pending?" Wonderful, if there is, we get out of this thing. And then it's, "Hey, should we be rescheduling or not? Yes, no, maybe? Let's try it." This is a huge hack. Thomas rightfully says we should not be doing this, and he has plans to clean it up. It is hugely non-deterministic. It lets other processes become a little more deterministic, but the process that was doing this writing just got interrupted and rescheduled, even though it had some more
work to do, and it could have done that work, maybe, but other things happen. Very bad pattern. The serial port does this, or rather the TTY layer does this, anyway.

Then we're done, and we keep on looping. We loop over all the data that was sent to us, and once it's all passed to the line discipline, we're good. We'll return to user space and we're happy. We release some locks, and then we do something fun: we update the time. We update the time on the device node, because in traditional Unix and the POSIX rules, if you write to a device node the timestamp should be updated as the last access time or last modified time. It turns out that's a security hole. Remember, your passwords, your keyboard input, go through a pseudo terminal device, a TTY device. If you can watch the timestamps, you can detect the data that's passed through the char node by looking at the access times as the bits go through it. It's crazy. It was an exploit a long time ago, so we don't do that anymore.

tty_update_time actually, again non-deterministically, goes out and grabs the real time from the system. On some architectures, most 64-bit ones, that's fine: the time is just a variable, and away you go. On some embedded platforms this is very expensive, and you've got to watch out for it. Again, not deterministic. You just want to write some serial bytes, so why are you hitting the time chip? This is why. Then we grab another lock, an independent lock, and we iterate over all the different TTY file descriptors that happen to be open. Again, not deterministic: you don't know how many are open or closed for that specific one. And then we change them all, but we do it with a granularity of 8 seconds. We took a guess a number of years ago when this security bug came up, and figured that you can't really detect anything if you have an 8-second window. So we do some fast bitmap math, which is actually the fastest part, and then we release the spinlock. But again, not deterministic. Oh, I didn't even write that one down, so that's number two. And
then we unlock the TTY device, everybody's happy, and we go on.

So let's talk about the line discipline, because remember, the TTY device passes the data on to the line discipline. For the line discipline I'm going to talk about the normal one, the n_tty one. This one is traditional and pretty easy. We're going to loop over all the data that was given to us, and since we're going to write some data, we call down_read. We're going to write data, so we call read. This is the first in a long line of uses of a read-write semaphore backwards. It's really fast to grab a read-write semaphore as a reader, and very slow to grab it as a writer. The TTY layer uses this semaphore in a very, very tricky way, grabbing reader locks in fast ways because we just know it's going to be safe. Very scary stuff.

But then, the line discipline can do echoes. You can do echoes for different protocols and fun things like that, so we'll process the echo characters. We don't know how many there are, so let's process them all. Again, non-deterministic mess number two, which should be three. We don't know how many, so you're going to process a bunch of data before you've even started sending your real data. Then we add ourselves to a wait queue and check for pending signals again. If any pending signals are there, we abort, so the data never actually got down. Again, not very deterministic. Nice.

And then there's something called output blocks in the TTY layer, and we're going to process all of those as well. Not only the echo characters: let's process all the output blocks before we get to your real data. Again, non-deterministic number three, which is four. We still haven't got to our real data.

Then, finally, let's grab a lock. We had a read lock, so let's grab another lock, a mutex lock, and then we call down to the TTY driver, which I'll talk about a little more. Then we unlock our mutex, and then we up our read lock. Because we were writing data, we up a read lock. Anyway, very tricky. It turns out to be pretty fast, but we
got some locks. We're one lock deep in the TTY layer and two locks deep in the n_tty layer, and now let's talk to our serial port. Oh, first off, we wake up the wait queue, and then we down_read again, because why not, and jump back to the top of the loop. Then we up_read some time later, remove ourselves from the wait queue, and we're okay. That handles our loop. The loop is written a little bit backwards, just for speed. It's very tricky, but it works. But again: read-write locks used backwards, mutexes, fun stuff.

Let's talk about the TTY driver. We'll just do the serial port, since we're all used to that. We grab another lock; the UART port wants to grab another lock. And then, because we have the data in one buffer, let's copy it to another buffer, but only a page-size big, because we don't want to have too much memory stuck in there. So if they give us more than a page, we have to do this multiple times. Great. You'd think we should coordinate these a little bit better, but it turns out we don't really send a huge amount of data all at once. I'll show you more on the read path, where it's even worse. But page size seems to work pretty well, and that way we're not wasting a ton of space. Then we call down to the real chip, the UART send, after that.

Okay, so we're one TTY lock and two n_tty locks deep, three locks deep here, plus a memcpy of some unknown time, and then we send. In the UART we start twiddling with the PM bits, because we don't want to go to sleep, because we're actually going to talk to the UART. Then we talk to the real chip down below that, an 8250 send, and because we like it, we tweak the PM flags again. PM flags luckily are reentrant, so it's okay, but we tweak them again. They're not really locks, but it's messy, messing with things. And then the UART goes out and reads the LSR from hardware. Reading from hardware takes who knows how long, but we do it anyway; we need to know this data before we can actually start sending stuff. So
that's non-deterministic mess number, well, five. Some chips read it fast, because it's just a read from memory. Some go over DMA, semi-fast, but you've got to wait for the DMA loop to happen. Some chips go over USB, and it has to come back along that path. It's a mess. Again, non-deterministic, and we're still just trying to write our data. We haven't done it yet.

And then we start writing our bytes to the UART itself. For some UARTs it's one byte at a time: we wait, we write, in a small loop. For some we set up a DMA buffer and post it all off, and it'll happen later, which is nicer. And some other ones... [Audience member: the pm_runtime_get can be async, so that also adds one more to your non-deterministic mess counter.] What, the PM flags? [Audience member: no, the pm_runtime_get on the previous slide can also have asynchronous callbacks, which adds one more level to your...] Oh, it does? Oh God. All right, it's even worse. I didn't even catch that one. Okay, yes, it's even worse than I thought. Great. Please interrupt me; lots of people know this better than I do.

This is where things finally get out to a UART. The fun part is that for some UARTs there's an internal buffer in the chip itself, and the chip then spools those bytes out to the line or whatever. So we've had all these non-deterministic waits up to the chip, and then the chip can do whatever it wants. Sometimes it just does it, but sometimes we need to flush all the data out, so we sleep and wait until it's all out there. Again, very non-deterministic, and I'm not going to count these numbers anymore. It's deep.

And then we unlock the UART port. Great, finally, we're done. The data's out, it's all gone to the device, and everybody's happy. We were four or five locks deep, with X number of non-deterministic waits, before user space gets told that it had finished. So when you have a real-time device and a user space process that wants to write some data out to a UART, you can never guarantee when that process will return. That's not good. Real-time
people don't like that, rightfully so. Don't do that. Don't mess with real time and UARTs. That goes against everything we do in embedded, but that is a solution, and I'll talk about a few more solutions at the end.

So, cool, that was write. That was the write path, the simple one. Let's go the other way. When the hardware gets data, it's interrupt driven, and it comes in at unknown times. You have an interrupt, you have an ISR, you have a URB loop for USB, you've got some other things happening. The driver will call a function, and this shows the age here: if you want to insert just one character, it's called flip char, or flip string if you have a bunch of characters. Traditionally we should all call flip string, because you have more than one byte. This goes back to the original TTY layer. We had these things called flip buffers, where we're writing to one buffer; it's much like a flip buffer on a video screen. You write to one, then you flip over, and then you can write to the other one. You have two buffers in play at one time. We don't have that anymore, but the naming has stuck. Someday we'll clean this up. Jiri Slaby, my co-maintainer for the serial layer and the TTY layer, is going through and fixing up a number of these names and cleaning things up, good stuff. Maybe we'll get around to this one.

Anyway, you push it in, and then you call something named flip buffer push. So you write a bunch of data to something and then you say, "Go do it." But you don't really know what happens there, so let's talk about what happens. When you call insert flip whatever, they both go down to the same place. The first thing is: do we have enough memory? No? Let's allocate more. We could be in interrupt context, so we cannot fail. We can sleep forever, we can spin, we can do all sorts of fun things that we do not know about. That's a huge non-deterministic mess. No driver checks to see if we fail. And the max we can allocate is one megabyte. Luckily we don't
usually overflow. If we overflow, we don't care: that means there are no readers. But this is a bug that's been in the kernel for the past 30 or 35 years and nobody's noticed, so it must be okay. It's kind of scary when you see it, though. We get bug reports from the static analysis people saying, "You're not checking the memory allocation." I'm like, "Yeah. Can you test it?" They're like, "No." Okay.

And then we loop and copy all the data. But think about that allocation: it can sleep, and that's a huge non-deterministic amount of time. So if you're receiving data from hardware, that data getting into the kernel, so that user space can access it, is going to be delayed by some unknown amount of time that we can never guarantee. That's bad. You don't really want that. Anyway, TTY_BUFFER_PAGE is the biggest size. I think it's bigger than a page; I don't remember what it is, somebody can look that up. A couple of megabytes. It's a define that's been there forever.

And then we push, and here's where the magic really, really starts to happen. If you ever read the kernel locking model documentation, it references a whole bunch of functions that you should never call. This is one. We start messing with cache lines and cache line buffers, and these fun things called smp_store_release and a couple of other ones. This is all really, really black magic. It's a perfect example of things you should never do, but it works. Calling this means we're flushing some cache lines, and we're relying on somebody else, who's looking at that store somewhere else, to see it. We're really implementing a ring buffer, but in a very hand-tuned, hand-written way, because this was the first ring buffer in the kernel. It was written way before the ring buffer logic we have today. Ideally, one day we'll go back and make this use the real in-kernel ring buffer code instead. Steven, how many ring
buffers have you written? Four? Three in the kernel? Three or four or five. Here's another one; here's the original, OG one. I'd never use this. But anyway, then we wake up a workqueue. So your interrupt message came in and wakes up a workqueue, because you don't want to do all the work in interrupt context, but the data still needs to get to user space. So you wake up a workqueue and you can go back. We don't know how long the processor, the scheduler, can take to make up its mind and do something later.

And in the workqueue, let's grab a lock, why not? One lock per port, so we kind of make it parallelized, which looks a little better. And then, because locks aren't good enough, we do an atomic read, why not? And then we call smp_load_acquire() twice. Make sure that cache line is really there? No, it's for looking at different variables, but it's really, really tricky. Then we call the line discipline's receive buffer function, which I'll talk about later. And then we do the thing Thomas says never to do: we reschedule. And then, if you had a big buffer, we need to push it all up there, but you could have slept some unknown amount of time. We loop again, and then we finally unlock. This again is totally non-deterministic, because we had sleeping, we're waking up at some point in time, we could sleep again, and we keep flushing all this data out. A nightmare. It actually works, but it's a nightmare.

So in the line discipline, again we're writing data, so let's grab a read lock, the wrong way. And because we know what we're doing, to match with the other read locks, we call smp_load_acquire(), which actually was tied to a load release somewhere else. And then we finally copy it into our local buffer. This local buffer actually is good: we allocate it ahead of time, so we're not going to have to allocate at runtime while the data is flowing through the system. It's good; we know, here's the amount of data we're going to handle. For once
it's actually deterministic. Well, actually, load release: these load acquires and load releases, different CPUs can take different amounts of time. They can be longer, they can be shorter, depending on what's happening in the system. Again, not very deterministic, but we'll live with that. Then we wake up another workqueue, and then we up the read lock that we did the down on. Again, read locks: in all the data paths I'm going to show you, we never take a write lock on a read/write lock. Write locks are on the other paths, where you do configuration and whatnot. Magically, this all works, and then the data is in the line discipline.

So user space needs to get the data, so it'll call read on the TTY port. Hey, no locks for once. But we're only going to give you 64 bytes, and we put it on the stack. It wants to be fast, but if you're reading a lot of data from a serial port, you're guaranteed the fastest chunk you're ever going to get is 64 bytes. For some high-speed line transports that's pretty slow; I'm amazed nobody's ever complained about this. Then we call into the line discipline's read, and then we copy the data into the user space buffer, which can fault, again not deterministically. And then, because some people found some really, really interesting bugs, and passwords are in this data, we zero out the stack buffer. That was a neat hack that caused us to have to do that: people figured out how to read passwords out of a running kernel by looking at a specific point in memory. Ostensibly, if you're able to read kernel memory, you can do lots of other bad things. This memset is really free; processors copy data, especially setting it to zero, really, really fast. We're only setting 64 bytes, cache line to cache line; it's fast. So we're like, ah, this would be nice, we'll just make this real fast, boom. And it took out a whole nasty bunch of exploits, which was kind of fun. But anyway, you'll see this and you'll kind of wonder what happens, because after this we're done with that
buffer; we never use it again. This is why we do that.

So let's talk about the line discipline's read. Again, we're reading, so we call it read. We mess around with the cache lines again, and then we memcpy from the flip buffer that we had before, that was our local buffer, and we copy it into the stack buffer. But we can only copy 64 bytes, so it's a tinier chunk. Then we adjust some magic pointers for the ring buffer stuff, and then we call tty_audit_add_data(). The audit subsystem is this really big subsystem; if you look at some distro kernels where it's enabled, you see all these auditing messages. It's a way to track everything that happens in the system. One thing they audit is all TTY data, and all TTY data includes your passwords. Odd, but they want it anyway. And then there's an up_read() call sometime later, because of the weird way we do a loop: sometimes we re-enter this loop with the lock held, and we know it's held, so we're okay. Anyway, it's a really interesting amount of spaghetti code to read if you're really bored and want to fall asleep.

Let's talk about tty_audit_add_data() some more. It allocates a buffer, and that can sleep, and it'll just sleep forever until it gets the data, because it really, really, really wants to audit that data. So it has a copy of this data. Let's allocate another buffer, and then we grab a lock. After we have the buffer, we copy it, and then we write out to the audit log. So there's another copy, and then there's another copy out to the write log. We're at like three or four copies and other allocations, and I don't have enough time to talk about how bad that is. That is hugely non-deterministic. You can do audit logs across networks, you can do audit logs all across the place, to disk. The stack depth here is almost infinite, it feels like. For embedded devices, if you don't have to for government regulations, turn off auditing. It's just going to be a nightmare.
And then we unlock the lock. That was a lock that was held, so any other read that would come through for that line discipline blocked on that lock, which is not intuitively obvious. Other locks that are held, we can nest them and it's okay. This lock actually makes the reading serialized, which kind of killed some throughput and killed some workloads. But again, audit must not be used that much; I don't really know. It's a hot lock anyway. And then the data goes to user space in 64-byte chunks, and we're done.

So that was the dump. Questions... let's talk about the details. All right, that was the details, but that is a little glimpse into all the nightmares that are involved: why it's not deterministic, why Thomas was right. It's a mess. It's hugely complicated. It's extraordinarily flexible; it's very easy to write drivers for, but it's extraordinarily flexible. We can assign any device to any line discipline, to any type of hardware, and interact in multiple ways. We can call from within the kernel, we can call from sleeping contexts and non-sleeping contexts, we can have loopbacks inside, we can have echo characters, we can have different line terminal protocols that can inject other data along the way and do checksums and bundle things up. It's extraordinarily flexible, and it's why Linux has succeeded, but it is crazy. Way, way too many ways to sleep.

And the last one seems to not really be very obvious to most people outside of probably this room, so I want to spread the word: UARTs are a mess. They're complex, they're horrible beasts, and they're extraordinarily dumb. We have to write a lot of code to handle this stuff. Really simple UARTs are great; really simple UARTs are rare. And, as I said earlier, people keep making new ones. Why do you keep making a new one all the time? I do not want to know. You have to come up with all the code, you have to do those 36 callbacks and make it all work. It's a mess. Please, please stop doing that. We've got UARTs working; don't add new
ones.

So that's the bad. The good: it's fast, it's very, very fast. We support everything, it's very, very flexible, and it's why we have actually succeeded. So that's the bad and that's the good, but it doesn't really answer Thomas's question: what are we going to do? How do we fix this?

With the real-time subsystem, the last remaining bits are the things that people who don't work with operating systems think are the simplest bits: logging, printk. It's the most complex piece of code in the kernel. You can print a message from anywhere in the kernel: while you're oopsing, while you're crashing, from interrupt context, from bottom halves, from softirq, from normal user space. Logging is hard. Tying that into the console subsystem, which is tied into the TTY subsystem, is an even bigger beast. It all works, it's very good, but that's one of the last remaining hugely non-deterministic pieces of code in the kernel. Adding the patches from the real-time patch set to get this to work right is the final piece of the puzzle. We talked about this last year at the Maintainers, or the Plumbers, conference, and Linux Weekly News has a great summary of it. We're starting to see the patches flow out today on the mailing list. I accepted some other ones to isolate some UART locking into simple functions so that we can break the locks. We're going to kind of smash through the console subsystem: if we're oopsing and we're crashing and we want to do an NMI, we can also printk from an NMI, and fix the real-time problem. So the way real time is going to do this is pretty much just a punch straight through the layer, and that's fine, I'm okay with that. You need the flexibility, but we also need to add additional flexibility to break it. It's kind of realistic, but anyway, that'll work.

Never, never, ever call a UART or a TTY device from a non-real-time user space task. Or no: call it from a non-real-time one, don't call it from a real-time one. Don't call it when you need determinism. You're just playing with fire, especially if you
are playing with lasers. Lots of devices: we can do laser welding, robots, right, at 3 meters a second, for the past decade. Don't do it through a serial port. Do it through something that you know is deterministic, know is repeatable, know will work. The serial subsystem, the TTY subsystem, is not that beast. Don't do that.

Along those lines, also don't enable auditing if you have to care about real time and throughput, because auditing adds another huge layer of non-determinism, not only to the TTY layer but to all other portions of the kernel. There are ways to disable auditing at runtime, but even if you disable auditing, a lot of those locks are still there, and a lot of the memcpys and the buffer allocations are still there; it's just that when it gets down to a lower level, the auditing subsystem returns quicker. Still, you just added some additional code paths, you blew through a few cache lines, and you need to go back again. Every function call in the kernel now has additional overhead thanks to Spectre and Meltdown, so watch out for that.

And then the best way to fix it: just don't use it. Why do you want to use the TTY layer for an embedded device? You usually just want to get some data out for debugging, right? You want to see the printk messages, you want to see something nice and simple. But you have to enable this 65K hunk of a monster to do this stuff. Nicolas Pitre did a great job of trying to tame that beast and slim it down. I think he ended up with, is it like 25K? He got it down about halfway, with like no functionality. Thomas has proposed a different way to do this, and I totally agree with him; he talks about it in the talk a couple of months back. Let's just have a new character device: simple ring buffer, no line control, just spit data out, spit data in. You can make it deterministic, make it simple, make it fast. Just do that for one type of device. Maybe we can
plug in a few UARTs at the bottom if you really care, but once you start talking UARTs and start doing line changes, then we're back to the mess we had before. But if you really want it, let's do something like this. You can search for this file; it's not out there. I'm hoping somebody will actually send it. If you care and you do want determinism in the TTY subsystem, and you just want some debug data, or you want to talk to a device at a fixed line speed or whatnot, I'm very willing to take patches to do that. I tried to resuscitate Nicolas's patches; they don't really work anymore, and he was trying to subvert it at a different level, with no line disciplines and the like. We think we should just not even worry about that at all. Thomas goes into more detail about it in his talk at that other conference, from the LWN article, and I totally agree with him. I'll do that.

And then Jiri has been cleaning up the TTY layer a lot, trying to make things a little simpler to use. We'll definitely start getting down to dropping a few of these locks; we have too many locks. We still have a big kernel lock in the TTY layer, one of those mutexes, one of those four that we use. We need to start breaking that down a little bit more. I don't think we can ever get rid of all the non-determinism, just by virtue of the flexibility we have to have, but we can get better, and we're willing to take patches. There are a few people starting to do this work and I'm very happy to see this. So, best way to do it: just leave it alone. Thank you. [Applause]

So these are the mascots from Kernel Recipes. Maybe next year you guys will have a few branded ones as well.

Yeah. You said there were UARTs that had simple drivers. Can you promote those simple UART drivers, and just have a list with line count and complexity of those drivers, so that vendors know what to use in their upcoming silicon?

I can do that, yeah. The
problem is our 8250 driver has grown to be such a beast, because we support everything in it. We've talked about ways of... I think we split it up in such a way that if you disable a whole bunch of options it can get a lot tinier, but you still have a bunch of complexity in there. The biggest thing is you're not following the standard specs; you do a little quirk. If you have to do a little quirk, you need another little function, you need a function callback. Just don't do any quirks. Just do a bog-standard 8250 driver, please.

So a follow-up to that would be: what's your ideal driver? Like, "this is the template you should copy because it works so well."

I'll point at the USB layer. The USB layer did a really simple driver for USB serial to start with; it's called the generic USB serial driver. It can send data in, talk out the USB port, send it back out. So if you take that infrastructure and replace that "send data out" and "receive data", and don't care about line speeds and line changes and whatnot, the logic there is very tiny and slim; I think we're down to 5 or 6K max. And you can just replace that with a real chip on the other end. I'm hoping somebody would do that for a real, like, raw UART driver, making something like that. But talk to me offline. Yeah, that's a good question; I'll have to look that up.

And then you can put that on a board, right?

Yes, we'll talk.

How does serdev fit into this? Does it also need all these layers, or...?

I'm sorry, serdev?

You can have serdev devices, and they bind directly to the serial port, like when you have a co-processor or something. Do they also have to deal with all of the TTY locking and such?

Yes, they do, but they go through... it depends on where that hooking up happens. Is that through serdev?

Yeah, serdev.

Yeah, the serdev layer, it does have its own line discipline, so it intercepts really early at the TTY layer and it kind of does some loopbacks within
the kernel. So Bluetooth uses that, Bluetooth TTY uses that, and other ones. But you still have a lot of those line disciplines; it depends on what line discipline you assign to it. Well, it's kind of its own. It's not as bad, but it's not perfect; it's not deterministic. Rob's done a really good job to make it easy to use, not necessarily fast, because these line rates are pretty slow overall these days. I could analyze that, but it's not that bad and not that good. Sorry, I don't really have a good answer.

Okay, thank you.

Cool. I wanted to ask about this write DMA path. So if you take a dd and pump it into /dev/ttyS-whatever directly, there is a memcpy in that path before it hits the UART; there's like three memcpys. So there's no chance to do high-performance DMA into /dev/ttyS-whatever? I mean, if the TTY serial driver picks up the buffers using DMA, there will be a lot of memcpy before the DMA, right?

Yeah.

So the performance will just suck?

The performance will suck, to a point. Kernels copy data really, really fast. To do a DMA setup, set up a DMA buffer, switch all that stuff, and then enable a DMA engine is very expensive, right? We do have UARTs that do DMA, and that's good for this stupid little polling where we go character, character, character: we can do a big chunk of characters and then return, and we go on, and we know that the data is going to make it, right? Sometimes we look at it, but usually we just hope and pray it gets there. That saves that portion of the CPU burn; it does not save your throughput. But to be fair, even at line rates of really high baud rates, copying data is fast. But it is very non-deterministic. To set up a DMA buffer in user space and pass it down, I've never seen that request.

And is the DMA always picking up a byte of data, or is the buffer somehow bigger?

Eventually that's up to the driver.

Okay, thanks.

I don't know; different drivers do it in different ways. If you look at some of
the drivers that are PCI cards, they have multiple UARTs on them and they talk some other protocol, with DMA buffers and mailboxes and fun. They look like a SCSI device. They do things in a different way than 8250 DMA does, but 8250 DMA is pretty standardized these days, so it's not that hard. It's actually split out into its own file, so you can look at it. But all those memcpys happen first.

Hello, thanks for your presentation. My question is: do you think that the POSIX functions to configure the TTY were part of the problem, adding a lot of flexibility for legacy and unused features? Or is there no relationship?

Yeah, POSIX. Well, POSIX just standardized the API in the way that everybody was already doing it up to then. I mean, RMS is the guy who started POSIX, so you can blame him for that, but it was a good idea. So the flexibility of TTYs: TTYs have, again, been around since the early CPUs and the early multi-user processors, so you have to have a way to do job control and signal handling and to handle all these different types of hardware, and POSIX rules require that. And it's good, because then we can have a console, and we can switch consoles, and we can have a serial port, we can do our SSH, we can deal with SLIP and all these other functionalities. This is good functionality that we need and want. So we've done it all; let's leave it alone. But if you want to circumvent it all, maybe we should also add something else. POSIX rules are here to stay. We need to always support legacy stuff. You really want to run those old binaries, you really want to run a BBS; all that stuff is still in use today. Again, the German train network runs on ISDN, which is over the serial layer. We still need those requirements. So it is complex, it is messy, but it's non-trivial functionality: pseudo TTYs and real TTYs. And we did trim out a few. We don't have the control node TTYs; I got rid of those about 20 years ago, finally. BSD still has those, but we got rid of
those, and those might not be fully POSIX compliant, but nobody seemed to notice, so we're okay.

Anything else? Well, I'm right on time and it's lunchtime, so thank you very much.
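Editor's illustration: the read path described in the talk hands data to user space through a 64-byte on-stack buffer that is zeroed after every copy. The Python sketch below is only a toy model of that loop, not kernel code. The names `chunked_read`, `CHUNK`, and `ldisc_buf` are hypothetical; only the 64-byte chunk size and the zeroing step come from the talk.

```python
from typing import Tuple

CHUNK = 64  # chunk size mentioned in the talk; everything else here is an assumption


def chunked_read(ldisc_buf: bytearray, want: int) -> Tuple[bytes, int]:
    """Toy model: move up to `want` bytes to "user space", CHUNK bytes at a time.

    Returns the bytes delivered and the number of loop iterations needed.
    """
    user = bytearray()            # stands in for the user-space buffer
    stack_buf = bytearray(CHUNK)  # stands in for the small on-stack kernel buffer
    iterations = 0
    while want > 0 and ldisc_buf:
        n = min(CHUNK, want, len(ldisc_buf))
        stack_buf[:n] = ldisc_buf[:n]  # line-discipline buffer -> stack buffer
        del ldisc_buf[:n]              # consume the bytes from the source
        user += stack_buf[:n]          # stack buffer -> user buffer
        stack_buf[:n] = bytes(n)       # zero the stack buffer so no data lingers
        want -= n
        iterations += 1
    return bytes(user), iterations


if __name__ == "__main__":
    pending = bytearray(range(200))  # 200 bytes waiting in the line discipline
    data, iterations = chunked_read(pending, 150)
    print(len(data), iterations)     # prints "150 3"
```

A 150-byte read takes three passes (64 + 64 + 22 bytes) in this model, which is the per-chunk overhead the talk complains about for high-speed links.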
Info
Channel: BayLibre
Views: 2,394
Keywords: Embedded Recipes, Linux, Baylibre, TTY, Linux Foundation, Open Source Software, OSS, FOSS
Id: g4sZUBS57OQ
Length: 53min 24sec (3204 seconds)
Published: Thu Nov 02 2023