John Carmack Tech Talk with UMKC-SCE

Reddit Comments

Gotta find a good vein. I take my Carmack intravenously

👍︎︎ 20 👤︎︎ u/Toby1993 📅︎︎ May 30 2017 🗫︎ replies

Thanks for posting this! Really interesting and crazy to see how things started to form what VR is today. Problem solving at its finest.

👍︎︎ 7 👤︎︎ u/Snake1029 📅︎︎ May 30 2017 🗫︎ replies

John Carmack could talk about making perfect bubbles in a milkshake and I'd want to hear it all.

👍︎︎ 8 👤︎︎ u/Serpher 📅︎︎ May 31 2017 🗫︎ replies

Seems Carmack has been focusing on video lately. Looking forward to seeing the fruits of this work.

👍︎︎ 5 👤︎︎ u/inter4ever 📅︎︎ May 30 2017 🗫︎ replies

Interesting video. So according to Carmack, there's still too much latency in all the existing eye tracking solutions. He says eye tracking is critical so it must be a big focus at Oculus. Sounds like there's still a lot of work to do.

It was also very interesting to hear him describe view based rendering of 360 video as a failed experiment. That there is too much latency in the current video compression technologies to be able to compensate for swift head turns when watching 360 video in a HMD. It appears the Facebook 360 video team have pushed on to look at other possible performance gains with 360 video rendering performance.

Also he talks a lot about assessing the value in certain enhancements and looking at the value curve but then later describes a lot of the video enhancements he is working on that seem quite marginal in value. Sure some of the enhancements look to be of great value but others seem quite low value.

👍︎︎ 3 👤︎︎ u/morfanis 📅︎︎ May 31 2017 🗫︎ replies

Wow this covered a lot of interesting questions about VR technology.

👍︎︎ 3 👤︎︎ u/rambosoy 📅︎︎ May 30 2017 🗫︎ replies

John Carmack always manages to capture my attention.

👍︎︎ 3 👤︎︎ u/InsidiousBoot 📅︎︎ May 30 2017 🗫︎ replies

When did John Carmack become the new star of Cooking With Dog?

👍︎︎ 1 👤︎︎ u/bluexy 📅︎︎ May 31 2017 🗫︎ replies
Captions
Okay, so I have done a lot of engineering work across computer graphics and gaming, aerospace, and virtual reality. What I'm going to talk about today is systems engineering, which is looking across all the different pieces that make up a problem and seeing how you can cut pieces out and optimize the different areas, because all of my real successes have had elements of this.

The way things are done nowadays, everybody gets their little specialization: you have the person who's going to make the best this and the person who's going to make the best that, and they're coupled together in some way with an interface. But in many cases there are inefficiencies built in from that, because if person A doesn't know what's different on person B's side, they're just working through the interface, and that's all you get. This has given us marvelous things; we have the ability to just pull levers and have amazing things happen across different aspects of engineering. But knowing what's actually going on in the different areas can allow you to sidestep a lot of things and get greater efficiency.

For example, one of the rocket ships we built was a lunar lander simulator: a rocket that flies down, moves over, and lands in another place. One of the major aspects of that is landing gear. The lunar lander program spent immense amounts of effort building the landing gear for the real lunar lander, and you would have teams of people who said, "I'm going to build the best damn landing gear the world has ever seen; this is my task, and I'm going to be an excellent engineer and do a great job." One of the things we did on one of our vehicles was to ask: what if we just didn't have any landing gear at all? We've got pressure-fed propellant tanks; what if we just put a little bit of rubber on the bottom and let it thud down on the ground? It turns out that actually works just fine. You had this entire section of engineering, the landing gear, and instead of getting better and better it just completely vanished. That's one example of looking at the entire problem, if what we're really trying to do is a lunar lander simulation or analog.

When we look at more computer-science-type things, one of my canonical examples now is the virtual reality head-mounted display, and this was a really interesting trip through all the systems that make it up. In a traditional computer game, you've got your computer, some input going in, maybe a game controller, and your output going to a monitor. That's your classic computer science system: inputs, processing, outputs, and this is fundamentally what video games and everything you do with a computer does. So it's easy to see how people thought that if you want a virtual reality headset, you take the same computer and plug in a headset that has some lenses to look through and a strap on the back, with a sensor coming back that tells you which direction it's pointing. It's basically taking the place of the monitor, and it looks like you've just jammed two peripherals together into a new device, and that should do what you want.
But when you do that, initially it's really disappointing, because what turns out to be all right in terms of responsiveness for traditional gaming is really lacking when you put the display directly on your head.

One of the factors here, and it's an issue for almost all interactive things, is latency. It doesn't get tested nearly as much as bandwidth, the numbers people like to quote in advertising, but it's the delay between the time something happens, like mashing the button on your joystick, and the time you actually see something change on the screen: the difference between the motion of tapping something and the time actual photons come off the screen, head toward your eye, and are processed. In the old days, if you go back to very old gaming systems like an Atari 2600, these were extremely low latency, under 16 milliseconds. In many cases you do something on your controller and the raster coming off the CRT changes, potentially even on the same scanline, fractions of a millisecond after you've done it. But modern computers have an enormous number of buffers and subsystems between what happens at the controller and what actually comes out on the screen. In fact, a modern joypad has a more powerful microcontroller than an entire old video game system; there is more computing power and more processing going on just in the controller than what used to be the whole console.

As with any good engineering effort, when you want to improve something, the first thing you have to do is measure it: decide what you want to measure, figure out your process for measuring it, and collect some data. Even before virtual reality, one metric that was important for gaming was ping time, the latency on a networked game. That's nice because it's a latency the computer can measure by itself: you mark the time, send a packet out, record the time when it comes back, and look at the delta. That's the latency of your network communication, and it makes optimization easy even though it's a complex system. With physical things in the real world it's a lot harder, because the computer by itself can't tell you exactly when something happened; all it knows is when the event finally came in from the peripheral, and when it pushed something out toward the screen, which probably isn't even the last step.

So what I would do for this kind of end-to-end testing is take a high-speed video camera. Nowadays everybody's phone can record at 120 or 240 Hz, which is a great thing, but five or ten years ago you had to buy a specialized camera for the higher frame rates. I would set up the camera recording with the joypad and a TV screen both in frame, mash the button with a really sharp impulse, wait until something changes on the screen, pull the video off the camera, and step through it one frame at a time: all right, on frame 27 the button is pushed down. Even there it gets a little mushy, because is it the first contact of the button or the time it's fully pressed down? There's some skew if you're trying to be accurate about which four milliseconds it happened in, but you can certainly tell within 10 milliseconds or so very easily, and then you count frames, single-stepping until you see something come out on the screen.
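As a back-of-the-envelope illustration of that frame-counting arithmetic, here is a minimal C sketch; the camera rate and frame numbers are made-up values, not measurements from the talk:

```c
#include <stdio.h>

/* Convert a high-speed-camera frame count into end-to-end latency.
 * camera_fps and the frame indices are whatever your capture gives you;
 * the values below are illustrative only. */
static double frames_to_latency_ms(int button_frame, int screen_frame, double camera_fps)
{
    return (screen_frame - button_frame) * 1000.0 / camera_fps;
}

int main(void)
{
    double camera_fps = 240.0;   /* phone slow-motion capture */
    int button_frame  = 27;      /* frame where the button is first seen depressed */
    int screen_frame  = 52;      /* frame where the screen first changes */

    /* 25 frames at 240 Hz is roughly 104 ms of end-to-end latency; the
     * measurement is only good to about one camera frame (~4 ms) plus the
     * ambiguity of when the button "really" actuated. */
    printf("latency ~= %.1f ms (+/- %.1f ms)\n",
           frames_to_latency_ms(button_frame, screen_frame, camera_fps),
           1000.0 / camera_fps);
    return 0;
}
```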
Conventional video games have, for a number of reasons, taken some pretty large steps backwards in responsiveness, and this usually gets driven by graphics. A lot of the blame can be laid at my feet for starting some of these trends in what people optimize for, but when you optimize for the image that comes out of the screen, you're optimizing in many ways for bandwidth, for everything that goes on in the CPU and the graphics processor, and all of this stacks up more and more latency. If you take a modern Xbox or PlayStation game, not to be partisan about it, plug it into a typical user's living-room television set and run this test, it's very common to see over a hundred milliseconds of latency from the time you mash a button to the time something actually changes on the screen.

Now, with a gamepad on a console that's not that terrible. People who are connoisseurs of gaming, or competitive gamers, will care about all of these small amounts, but especially in a third-person game, where you're controlling a character running around down on the screen, you lose more in the animation blending of how the character moves than you do to this, and it's hidden fairly well. If you get into more competitive types of games, like a first-person game where you're using a mouse, it's much more of a problem.

You can make a graph of the value of this response time, with time on one axis and an arbitrary scale of value on the other. With something like a gamepad, at 100 milliseconds you're at "this is okay," and the value goes up as latency goes down, but only to a point, because gamepads are in many ways terrible: you're integrating the position of your thumb over time, which is not a great human interface. It's imprecise; it's amazing what the best players can do with it, but it's fundamentally not a great user interface. With something like a mouse, where you're controlling your point of view directly, there's a one-to-one correlation, and that has a lot more value. The curve probably starts lower, because a laggy mouse feels worse than a laggy game controller, but it goes up higher and gets significantly better at much lower times, down to where tens of milliseconds is a really useful improvement.

Now, with a head-mounted display on your head, the goal is to make it feel like you're in a virtual world, like you're just looking around and the world is solid and simply there. If you have an appreciable amount of latency, you can still say, "oh, it responds to the way my head is turning; I turn my head and the view comes around," and there's a little bit of value to that, but there's no way you mistake it for feeling like you're really there. This is the way virtual reality displays really were for a couple of decades. You would see people write articles about how it's going to change everything and put you in a different place, but so much of that was really garbage.
If you tried those actual displays, you'd say, "I can imagine what this would be like if it were actually good, but that's not what I'm experiencing right now." You're essentially using your head as a controller, and that's not great, and it can make you really sick quite quickly. When the world does not respond the way your brain expects it to, one of the theories is that your brain thinks you've eaten something poisonous that's making you see things incorrectly, and that you should therefore throw up and purge it out of your system. This is one of the problems with head-mounted displays: if your synthetic view of the world is far enough from what the rest of your body, specifically your vestibular system, thinks it should be, it can make you sick.

Now, it turns out there's a number, around maybe 20 milliseconds, which is about where we've decided it feels pretty much like reality. There are still things you can detect at 20 milliseconds: there are a few tests you can do, like rattling the head-mounted display on your head, where in certain ways you can tell the difference between 20 and 15 or 10 milliseconds, but you have to be really pretty precise about it. Even among people inside the industry, if I ran an A/B test between 15 and 20 milliseconds, very few of them would be able to tell the difference. So there is a point of high value, and it tapers off fairly quickly after that, but there's a steep part of the curve somewhere between 20 and 30 milliseconds. If you're at 50 milliseconds or more, it's just not a good head-mounted display.

The challenge then becomes: how do we take something that might be running at 100 milliseconds and get it all the way down to 20 milliseconds? What are all the things that contribute? This is an extremely complicated system. There are so many pieces that are moving, some of them physically, like the actual head-mounted display, but also in terms of the systems everything goes through. You've got this simple model, here's the computer, I plug in a sensor, I pull the video signal out, but where is all the complexity? It breaks down into a whole lot of things.

Start at just the sensor. This was something that surprised me when I first got into aerospace. I thought, well, I'm going to build a guided rocket; the rocket needs to know which way it's pointing. Coming into it as a naive software-only person at the time, I thought surely there's just a sensor you buy that tells you which direction things are pointing, that this has to be a solved problem. I was really kind of surprised that it's not that simple: there's no sensor without trade-offs that gives you what you want for carrying a vehicle around. Virtual reality tracks many of the same issues. You can use optical systems with cameras to track things, but you have to worry about occlusions, precision that varies the further you get from the camera, and various other things like that. But the most basic system that all virtual reality devices have now is an inertial measurement unit, an IMU, very similar to what we used in the rocket ships I built. In fact, when I first started working on virtual reality, I literally took the integration code from my rocket ship and copied it over to the PC project, because it does the integration in a good way.
The basic cheap sensors are MEMS devices, micromachined sensors, in an inertial measurement unit. The first virtual reality headset we started working with had a MEMS sensor and a library you could talk to; you could ask, "what's my orientation, my body frame?" and it would happily give you a number back. But it turned out that the sensor alone, even if you measured nothing else, contributed nearly a hundred milliseconds of latency. Just that one box, which you'd hope was a small part of the long chain that ends at the display, was by itself five times more than the total budget for a really good experience.

If you look inside that MEMS IMU, there are potentially three or more different processes: the actual sensing element, a microcontroller, and a little processing device on the sensor itself. The sensing elements are tiny etched structures that vibrate at very high rates, and the part might be outputting at 50 kilohertz, so there are just a couple of microseconds between the times it could be updating. How does it go from a few microseconds there to 100 milliseconds by the time the data reaches the computer? This is where you start getting into buffers and filtering, and in the entire chain we're going to go through there are buffers everywhere. Some of them don't matter much, but some of them pile up into very large values.

It was my belief, looking at this somewhat from the outside as black boxes, that there was an immense amount of buffering and filtering going on in there, largely dating from when these sensors first came out a decade prior, when they really weren't very good. They were very noisy, and when you have a noisy sensor, the way to get a clean signal out of it is to average together a hundred or a thousand values: you get better precision by roughly the square root of the number of samples you put together. So you can take a crummy sensor and filter it enough to get a pretty good value out of it. Calibration is a whole other large topic that could be gone into, but at some point inside there they were buffering up a whole lot of samples and filtering them to get a number out.
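A minimal sketch of that averaging trade-off, assuming a simple boxcar filter and an illustrative 1 kHz raw sample rate (not the actual sensor's configuration): averaging N samples cuts white noise by roughly the square root of N, but it also delays the answer by about half the window.

```c
#include <stdio.h>

#define WINDOW 100  /* number of raw samples averaged together (illustrative) */

/* Boxcar average over the last WINDOW samples. Averaging N samples reduces
 * white noise by about sqrt(N), but the output now lags the motion by roughly
 * (N-1)/2 sample periods: at a 1 kHz raw rate, a 100-sample window adds
 * ~50 ms of delay all by itself. */
typedef struct {
    float history[WINDOW];
    int   index;
    float sum;
} boxcar_t;

static float boxcar_update(boxcar_t *f, float raw_sample)
{
    f->sum -= f->history[f->index];
    f->history[f->index] = raw_sample;
    f->sum += raw_sample;
    f->index = (f->index + 1) % WINDOW;
    return f->sum / WINDOW;
}

int main(void)
{
    boxcar_t f = {0};
    /* Feed a step input: the filtered value takes ~WINDOW samples to settle,
     * which is exactly the kind of latency being described. */
    for (int i = 0; i < 200; i++) {
        float filtered = boxcar_update(&f, (i < 50) ? 0.0f : 1.0f);
        if (i % 25 == 0)
            printf("sample %3d: filtered = %.2f\n", i, filtered);
    }
    return 0;
}
```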
From there it goes into a buffer queue: the sensor does some filtering of its own and spits values out, the microcontroller keeps another buffer queue that it's looking at, and then on a certain cadence it communicates over USB. The USB protocol is big all by itself; it's what you connect your keyboards and mice with, but a USB driver is actually hundreds of kilobytes of code, a significant amount, and there are lots of different modes it can operate in. This is where the history of the world around it affects the systems choices: a better way to connect controllers like this over USB would, in theory, be what's called isochronous mode, where you treat it like a microphone that's continuously streaming and the data comes out with very low latency. Instead these are all hooked up as what are called human interface devices, HID devices, and the reason has nothing to do with technical merit; it's that you can make a generic HID device and not need to get a driver signed by Microsoft to install it on the operating system. So here we are: there might be a better way to do this, but for reasons of convenience and technical inertia, these devices go out essentially pretending, to some degree, to be a mouse.

The rate the data goes out at is a configurable parameter. A basic USB device like a keyboard is usually 125 hertz, 125 updates a second, about eight milliseconds per update. But when updates only go out at fixed times like that, then if you have an impulse, somebody whacks the side of the sensor sharply right after one update, that motion event isn't going to be sent until the next time slot comes around. So you have a variable, fluctuating amount of latency: even if the filtering completely vanished and the sample went out at the first opportunity, you could still have anywhere from zero to eight milliseconds, depending on when the impulse actually happened, before it makes its way to the computer.

Inside the PC it of course doesn't go directly to the application; it goes into an operating system buffer, and the application might be in a blocking read mode or a polled mode, so there are some number of milliseconds there. This varies a lot by operating system: you can get hard real-time operating systems that respond to external events within a few microseconds, but on Windows it's usually a couple of milliseconds. That's not too bad; it's not one of the major contributors. Then the application gets its opportunity to read the event and do something with it, and we come back to cadences again. A typical console game today runs, unfortunately, at 30 frames per second; the higher class of games that care more about latency and responsiveness run at 60. Even in the optimistic case of 60 frames per second, that's another 16 millisecond cadence. There's also a delay just transmitting across the wire if it's a significant amount of information; these packets are small, so that's not significant here, but if you get into things like video streaming, where the transmission time can approach your frame time, that becomes a whole other level of delay. So your event happens, it gets sent out over USB, it takes a little while to transfer, a little while to get through the driver stack, and eventually the game gets around to its next frame and reads the input to see what actually happened. Already we've piled up 16 to 20 milliseconds of delay, disregarding the filtering.
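A rough sketch of how those independent cadences add up. The stage numbers below are the round figures mentioned above, and the averaging assumes events arrive at uniformly random times relative to each tick, which is an assumption rather than a measurement:

```c
#include <stdio.h>

/* Each stage that samples on its own cadence adds anywhere from zero to one
 * full period of delay, depending on when the event lands relative to the next
 * tick; on average it adds half a period. */
typedef struct { const char *name; double period_ms; double fixed_ms; } stage_t;

int main(void)
{
    stage_t stages[] = {
        { "USB HID poll (125 Hz)", 8.0,  0.0 },
        { "driver + OS delivery",  0.0,  2.0 },
        { "game loop (60 Hz)",    16.7,  0.0 },
    };
    double worst = 0.0, average = 0.0;
    for (int i = 0; i < 3; i++) {
        worst   += stages[i].fixed_ms + stages[i].period_ms;        /* event just missed the tick */
        average += stages[i].fixed_ms + stages[i].period_ms / 2.0;  /* uniformly random arrival */
        printf("%-24s up to %4.1f ms\n",
               stages[i].name, stages[i].fixed_ms + stages[i].period_ms);
    }
    printf("input -> game read: average ~%.1f ms, worst ~%.1f ms\n", average, worst);
    return 0;
}
```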
Then you think, well, the game is going to do something, draw an image, and send it out through the video system. But most games today are multi-threaded to increase what they can do, so instead of 10 enemies you can have 20 or 30 fighting on the screen at the same time. There's a pipeline where CPU one might read the input and run the game logic while CPU two does the drawing, and these run in parallel, so you're effectively creating two frames at once. Most games are pipelined at least two deep; often there are more processors involved, not in a strict pipeline, but the separation between game and rendering logic is quite common. So your input comes in, but instead of the frame being finished at the end of that frame time, which is generously 16 milliseconds and pessimistically 30 or more, it turns into a buffer of commands that gets forwarded to the next CPU, which runs for another frame. And then, instead of being done at that point, the work is often handed off to the GPU. If a game is maxing everything out and taking full advantage of the system, the graphics processor is carefully tuned to use almost the entire frame for its own processing, so the image spends yet another frame going through the graphics processor.

Even then it's not immediately displayed; we come to buffers again. There are buffers in each of these places. Usually this is constrained so things can't buffer up indefinitely, though if something is architected poorly it might, and the amount the graphics processor can soak up can sometimes be very large. This gets influenced by things like benchmarking trends: a benchmark is almost always reported as frames per second, how fast it can render flat out, paying no attention to latency at all. So it's in a graphics vendor's interest to have enormous buffers, because if a buffer ever runs dry the graphics processor has nothing to do and idles, and hardware vendors hate that; they want their units occupied all the time, even if it means latency you don't want. So this stage can sometimes be much worse, but even assuming you've gone to the trouble of forcing the graphics processor not to buffer more up, you can still have a whole extra frame there.

Even then the image isn't ready to go out, because it's put into the swap chain to be presented to the video driver and then to the actual screen, and that might be set up double buffered or triple buffered. There are reasons you might do that to make things smoother, but they all harm latency, so you can have multiple additional frames where the image is finished but sitting in a queue waiting to be displayed. And finally, when it's actually ready to go out the back of your computer to the monitor, it doesn't go all at once: it scans out one line at a time. Just like CRTs from 50 years ago, it starts at the top of the screen and scans the lines out one at a time. This means, and it's not obvious to people initially, that the top of your screen comes out 15 or 16 milliseconds earlier than the bottom. In the old days with CRTs, if you recorded the screen with a camera you would see beat frequencies between the display's strobing and the camera's own rate, and with a slow-motion camera you could actually see the raster drawing out, lighting up as it went and fading behind it. With LCDs today the panel is continuously illuminated, but if you take a slow-motion video you will still see the lines changing as the scan goes down. In modern systems at modern resolutions this is actually limited by the bandwidth of the cables we push the data across: they limit how fast you can refresh the screen, how fast you can push the pixels out.
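A small sketch of that scanout timing, using an idealized model that ignores blanking intervals; the line count and refresh rate are just example values:

```c
#include <stdio.h>

/* With a scanned-out display, the top of the frame reaches the panel almost a
 * full refresh period before the bottom. Rough model, ignoring blanking. */
static double scanout_delay_ms(int line, int total_lines, double refresh_hz)
{
    return (1000.0 / refresh_hz) * (double)line / (double)total_lines;
}

int main(void)
{
    int    lines   = 1440;   /* e.g. a portrait-scanned mobile panel */
    double refresh = 60.0;

    printf("first line:  +%.1f ms after scanout starts\n", scanout_delay_ms(0, lines, refresh));
    printf("middle line: +%.1f ms\n", scanout_delay_ms(lines / 2, lines, refresh));
    printf("last line:   +%.1f ms\n", scanout_delay_ms(lines - 1, lines, refresh));
    return 0;
}
```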
So if you total all of this up, this is how you get to 100 milliseconds for a typical console game, and then you could have another 100 milliseconds on top of that from the sensor being poorly configured at the beginning. The challenge, then, is how to get this responsiveness down to 20 milliseconds: basically cut it by an order of magnitude, from the bad case to what we want in the good case.

The typical, discrete way to look at it is to ask what you can do in each section to make it better. You can put in a hard GPU sync so it doesn't buffer up too much. You can disable triple buffering and carve one frame off that. You can say, well, we're going to do a simpler game that doesn't need the pipelining, so it can be just one frame; and if you also make the graphics simpler, if you're running something that doesn't need the power of a modern system, you can jam the CPU work and the GPU rendering into the same frame. But it starts to look really harsh when you notice that if it takes 16 milliseconds just to scan out the screen, how do you get to 20 milliseconds for the bottom of the screen? It seems to imply you've only got 4 milliseconds to do everything else.

This is where you start trying to find creative ways to cheat: ways to bypass these stages and still accomplish your goal without going through everything the conventional way. Most of what's been possible for virtual reality systems comes from bringing sensor input in much later. We have this long pipeline, and some of the key tricks are figuring out how to bring updated sensor information in after most of it has already run, and still get mostly what you want. As with all really good engineering trades, it is a trade, a compromise. You're doing something that is not perfect. It's not as good as if we just had a computer that was ten times faster; then we would do exactly what we do now, land in the responsiveness range we want, and life would be wonderful. But when you can't wait around for computers to get ten times faster, sometimes you need to figure out how to make the really smart trade-off. When you look at the value curves, you can say there's a really steep part here that I'm willing to sacrifice some things to climb, to bring the value up from this low area to this high area, and maybe I don't care as much about the final tail at the top. And you can have bitter arguments about priorities when it comes to things like this, because for anything you want to cut, you'll be able to find somebody who thinks it's the most critical, important piece of the project.
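To make the before-and-after concrete, here is a back-of-the-envelope budget with illustrative round numbers for each stage; these are not measurements of any particular console or headset:

```c
#include <stdio.h>

/* Rough latency budgets for the two pipelines discussed above.
 * Every number here is an illustrative round figure, not a measurement. */
static double total(const double *ms, int n)
{
    double t = 0;
    for (int i = 0; i < n; i++) t += ms[i];
    return t;
}

int main(void)
{
    /* Typical console path: filtered controller input, pipelined CPU frames,
     * deep GPU buffering, triple-buffered swap chain, full scanout. */
    double console[] = { 8, 2, 16, 16, 16, 33, 16 };  /* poll, OS, game, render, GPU, swap queue, scanout */

    /* VR path after the tricks: fast raw IMU, single combined frame,
     * hard GPU sync, late warp, half-screen scanout. */
    double vr[] = { 1, 1, 4, 3, 8 };                   /* poll, OS, sim+render, warp, half scanout */

    printf("typical console path: ~%.0f ms\n", total(console, 7));
    printf("optimized VR path:    ~%.0f ms\n", total(vr, 5));
    return 0;
}
```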
So, the path to cleaning this up for virtual reality: the most obvious thing was fixing the microcontroller that was doing all of this filtering to produce its body-frame output. This is where I took my aerospace code and said, forget the body-frame processing, give me the raw sensor values. And talking to the company: this 125 hertz update rate is okay for most things, but can you do better? They just said, sure, 250 is easy, maybe 500. If you take it as a given that your peripherals are just what's handed to you, because you're the software guy writing the computer code, you miss that it can be trivial for a peripheral vendor to say, "oh, I didn't know that was important; that's just one character in a configuration file, and I can give you double the update rate." So that eight milliseconds gets cut in half to four, and the filtering basically goes away. That still leaves maybe six milliseconds, all told, from the sensor to getting into the PC, and you still don't see how all the rest of it crushes down.

One of the other major tricks is hard to do on the PC but possible on the mobile systems we build. For decades now, people have been used to presenting an image to the graphics system: you say, I've finished my frame, pass it on to the video display, start scanning it out. But in the old days, graphics programmers would do what we called racing the beam: the screen is scanning out, and if you know where the raster is in memory, you can be drawing things just ahead of it. It's possible to get a millisecond of response time if you know exactly where it's reading from memory and you very carefully render right ahead of it. Conventionally now, on mobile, our screens actually scan sideways: the phone panel is considered a portrait display, so in landscape mode it scans from left to right. So we split the screen into two pieces, the two eye views, and we make sure we draw one view while the other is scanning out, and vice versa. Now you don't have 16 milliseconds of scanout latency, you've got 8, and that's a big cut. But you still look at your game and say, even a simple game is going to take 16-ish milliseconds, even doing all of the computing and all of the graphics inside one frame.

The next trick was to do a reprojection, and this is where you get into some of the actual compromises. When you've rendered an image, it's a perspective image from one point of view. It may be perfect, but it's perfect for the moment 40 milliseconds ago when you sampled the input. Instead, you can arrange to read the sensor again, not just at the beginning of the frame but a couple of milliseconds before you're ready to scan out, and it comes back and says: actually, we're pointing over here. You started off pointing this way, maybe you predicted where you'd be, but actually you're somewhere else. Given that information, it's possible to take whatever nice picture you rendered and distort it, warp it in a perspective-correct way, if you know you should really be looking a little bit more in another direction: you shift it over and perspective-distort it. It's not perfect: if you moved a lot and guessed really badly, there might not be anything at the edge, it might just be black, because you want to limit how much extra you render; maybe you render a little bit beyond the visible field if you've got a lot of graphics power, and you make the best guess you can, but then you fix it up at this late stage.
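A minimal sketch of that late correction, assuming an orientation-only fix-up computed from two quaternion pose samples; this illustrates the idea, not the actual shipping compositor code:

```c
#include <stdio.h>

/* The scene was rendered with the head pose sampled at the start of the frame;
 * just before scanout we read the sensor again and compute the small corrective
 * rotation the compositor should apply to the finished image.
 * Quaternions are (w, x, y, z); the pose values in main() are made up. */
typedef struct { float w, x, y, z; } quat;

static quat quat_conjugate(quat q) { quat r = { q.w, -q.x, -q.y, -q.z }; return r; }

static quat quat_multiply(quat a, quat b)
{
    quat r;
    r.w = a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z;
    r.x = a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y;
    r.y = a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x;
    r.z = a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w;
    return r;
}

/* Rotation that takes the render-time view to the just-sampled view. The
 * compositor turns this into a matrix and re-projects the rendered image
 * through it; pixels that rotate into view from outside the rendered field
 * of view simply are not there, which is the compromise being described. */
static quat timewarp_delta(quat q_at_render, quat q_at_scanout)
{
    return quat_multiply(q_at_scanout, quat_conjugate(q_at_render));
}

int main(void)
{
    quat q_render  = { 1.0f, 0.0f, 0.0f, 0.0f };           /* looking straight ahead */
    quat q_scanout = { 0.999962f, 0.0f, 0.008727f, 0.0f };  /* ~1 degree of yaw since then */
    quat d = timewarp_delta(q_render, q_scanout);
    printf("corrective rotation: (%.6f, %.6f, %.6f, %.6f)\n", d.w, d.x, d.y, d.z);
    return 0;
}
```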
This is what I'd been calling time warp: at the very end, after we've drawn everything, we arrange to read the sensors again and figure out how to distort the image we actually drew. At this point it's possible to be under 20 milliseconds. On our mobile systems we're usually around 16 to 18 milliseconds, depending on various things: a couple of milliseconds for the inputs, a couple for scheduling, a few milliseconds for drawing and the time warp, and then eight milliseconds for scanning out, plus a little bit of cushion. The trade-off is that this isn't always perfect. The perspective math works out so that, for an orientation change only, part of the image you're looking at can be almost pixel perfect, but in virtual reality, if there's a positional change or animation going on, the warped result won't be exactly correct. We had bitter arguments internally in the early days about how important some of this was, but the ability to get responsiveness down to that fairly important 20 millisecond range was really worthwhile, so we made these trade-offs.

It's been interesting to ask what the speed-of-light possibility is here: how low can you get this in the best case? There are things we've looked at, like instead of splitting the screen into two pieces, splitting it into eight and scheduling the rendering just before each one; you could knock an eight millisecond scanout delay down to two milliseconds, cutting six off. This comes back to those benefit curves: there's very bad value down low, it climbs really steeply, and then it climbs very gently if at all. We're at the point where we can set up experimental systems and get very close to the bottom, and it doesn't make a ton of difference. There's the possibility that with some things, like moving to ray tracing systems where we could adjust the position as well, it might matter more, but right now it doesn't feel like a huge win. That's where some of the real engineering wisdom comes in: figuring out how to fight like hell to climb the steep part of the curve, and then knowing when it's no longer that important and you can move on to other things.

The same basic pipeline applies, in a more problematic way, to hand controllers. They have very similar capabilities, an IMU in the hand controller, but there's a wireless link, so we're dealing with Bluetooth Low Energy as the transport. A Bluetooth stack can be half a megabyte of code; it's an extremely complicated specification, and there are lots of different ways you could use it. One of the bad ways it gets used is, again, when people look at this from a specialist's standpoint: "I'm going to do the best job possible filtering this IMU, it's going to be the cleanest data, but I'm just going to use a stream abstraction to talk to the CPU." You find a nice little library that gives you the equivalent of TCP over Bluetooth, and it will buffer things up, chop them into the necessary little transport packets, and generally be a horrible situation. So there are opportunities to be really broken there.
Then, on the display side, the same time warp trick can't be done for the hand controller, because the hand controller is baked into the rest of the rendered scene; you can't move just that little piece of it the way you can correct for how your head has moved. You can consider more tricks, and this gets into hairier trade-offs. You can say, what if instead of rendering your hand into the main scene, we render the hand to a whole separate layer, an image all by itself? Then we can do time warp tricks on it and move it around. But then you have to worry about cases like: what happens when you put your hand behind something in the world? If you're just layering it on top, that's not going to work. Do we carry a depth buffer for it as well, and make our compositor do depth comparisons as it puts things together? It starts getting to be a big mess. There are a few tricks you can do, like late latching: you can't go all the way to the end and do a time warp, but you can set things up so the GPU at least picks up the latest controller pose right as it renders. It might still have 14 milliseconds more latency than your head, but it can be a lot better.

Luckily, hand feedback is important for getting a sense of where your hand is, but it's not as critical as your head. It's more like going down toward the joypad threshold; a hand controller sits somewhere in that neighborhood, rather than needing the exact responsiveness your head expects. When your eyes are right there, your brain expects that when you look over, the world is going to be a certain way; but your sense of where your hand is when you move it around is not as precise as people think it is. There's actually been some cool research where, in virtual reality, people will put things on a table and make you reach for them, and you can be reaching a foot further than your arm actually goes, but if they're drawing it correctly and you're looking at it, you fool yourself. I had this done to me at, I believe, Texas A&M: they had something set up where you're reaching for an object and they're moving it around quite a lot as you look at it, and you don't notice. It's just not as critical as your head, whereas if you pitch the head view five degrees off, it will be immediately noticeable and will make you uncomfortable fairly quickly.

Another thing I'm working on right now is the end-to-end system problem of virtual reality video. This is one of the most important uses for VR right now, but largely I think the quality most people see when they watch a 360 video or stereo VR video is pretty bad. One of the things I was really happy with when we first launched the Gear VR product was that we were able to include a 16 gigabyte SD card with it, so we could ship some pretty high quality footage. Since then, though, everybody sees streamed video, and only a small fraction of people will go to the trouble of downloading high-quality video, and this pains me deeply, knowing that people are seeing these very low quality streams. On a lot of mobile systems you won't even get the 60 frames per second stereo where it really becomes kind of magical and you see the value of it. And it's a hard problem, because even the best case, the side-loaded content we shipped, isn't at the quality level we would like.
So, again going back to the speed-of-light notion, I start by asking: what would be the best possible quality we could deliver? The way we can test that is to say, all right, we've got this phone with a 1440 display. Even video encoding and decoding will have quality losses at some level, but we can at least make a small test case: take a phone with four gigs of memory, load up four gigs of uncompressed frames, and switch between them at 60 frames per second, perfect, with no artifacts. You can look at that and say, this is spectacularly good, so much better than what people are actually seeing. That's kind of how I define opportunity as an engineer: the delta between what's possible and what people are given today is the opportunity you have for improving things.

If you look at this entire pipeline, it's an even longer pipeline than the input one, because it crosses people and organizations and the internet on its way through. You have the way the content is created at one end: a camera generates a video file; the video file goes into some post-production step to warp it, say from the fisheye it starts out as into an equirectangular projection; it goes through whatever editing they put it through; then it goes into some kind of data compression system and is put up on an internet server; people connect to that and it's streamed over various protocols, with different versions for different bit rates, over TCP with error correction; eventually it gets down over the internet to somebody's VR system; and inside that we have the same route we had before, where it comes in as input, goes through processing and rendering, and eventually winds up on the screen. The problem is that the pixels that come out of the display are very far from what went into the camera.

This is where quality takes on a lot of different axes. When most people talk about quality in video, everybody thinks first about resolution: we've got 4K, or we're going to 8K, or whatever; resolution is the buzzword. Same thing with cameras, where purists get really uptight and say don't just look at the megapixels, look at the sensor size and all the other important factors; but still, resolution is one of the things you look at. With a little more knowledge you look at the bitrate you're actually getting across the internet, and if you've paid attention to video for a while, it's surprising how little we make do with now. Blu-ray discs were designed around what was considered necessary at the time for 1080p, Full HD video; they deliver around 50 megabits per second, and that's what video people thought a good 1080p stream needed. Now if you stream 1080p from Netflix or YouTube, you're getting a couple of megabits. Our compression is a little better now, but it's not that much better than what we had then, so really we're already used to settling for a lot on video quality.
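Some quick arithmetic behind the "four gigs of raw frames" experiment and the bitrate comparison, with illustrative round numbers:

```c
#include <stdio.h>

/* Rough numbers behind loading a few gigabytes of uncompressed frames and
 * comparing against delivered bitrates. All values are illustrative. */
int main(void)
{
    double w = 2560, h = 1440, bytes_per_pixel = 4;   /* uncompressed RGBA frame */
    double frame_bytes  = w * h * bytes_per_pixel;    /* ~14.7 MB per frame */
    double fps          = 60.0;
    double budget_bytes = 4.0 * 1024 * 1024 * 1024;   /* "four gigs of frames" */

    printf("raw frame:   %.1f MB\n", frame_bytes / 1e6);
    printf("raw rate:    %.0f Mbit/s\n", frame_bytes * fps * 8 / 1e6);
    printf("4 GB holds:  %.1f s of raw 60 fps video\n", budget_bytes / (frame_bytes * fps));

    /* For comparison: Blu-ray class streams are ~50 Mbit/s and typical internet
     * streams a few Mbit/s, so the compressed stream carries on the order of a
     * thousand times less data than the raw frames it reconstructs. */
    printf("50 Mbit/s is %.2f%% of the raw rate\n",
           50.0 / (frame_bytes * fps * 8 / 1e6) * 100.0);
    return 0;
}
```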
VR makes it much worse. When you're looking at your cell phone, little compression artifacts and little losses of quality just don't make that much difference, but when the image is blown up so that one pixel occupies roughly a tenth of a degree of visual arc, those are pixels that mean something. It's like the old video game days when we had 320 by 200: a pixel in a particular place really meant something, like a crack between triangles that just sat there on your monitor rather than getting lost in the anti-aliasing.

Video resolution is an area where interesting things are happening. For a while there was an arms race among the phone manufacturers for more and more resolution. We got to the point where all the phones we do VR on are 1440 displays, and one company, Sony, went and shipped a 4K phone, and it turned out nobody cared. In fact, even Samsung, who pushed all of our nice 1440 phones, now ships modern phones with an option to run everything as if it were only a 1080p phone, because it turns out most users prefer the savings in battery life and thermal overhead, since a lot of people literally can't tell the difference. It saddened me greatly to find that we're not going to be getting lots of 4K phones for VR, because that would have been great to get essentially for free, but it's probably not happening. And really, more resolution on the display device wouldn't help the video problem at all, because the videos are coming in at much lower resolution already.

What can help is, obviously, more bitrate. So you look at the transport side, and I've profiled some of our transport and found that in many cases we're not getting anywhere near what we should. You do a speed test and you're getting 20 megabits per second, but then you look at what the video streaming is actually delivering and it's only half of that. Why can't I get a better bitrate and choose a better representation? That's one direction to go, but it's a rabbit hole all by itself. What actually determines the streaming behavior? At Oculus we use the ExoPlayer project from Google, which is a big project layered on top of the lower-level video codecs, with lots of plugin systems and all sorts of things going on inside. So what I've done to start addressing that is to start from the lowest level I have access to as a user-level programmer on Android. I wrote what I've been calling Video Direct: if I look at only the pieces that deal with the decoding, the hardware interfaces, what if I just feed them myself? Take that couple hundred thousand lines of code that maybe I could dig through and figure out how to optimize, carve it all away, and start from the bedrock. Of course it's not really bedrock; that's a driver interface that goes down to things that talk to hardware below it, with tons of firmware in there, so bedrock for an application programmer is really still the top of a tall stack of things happening at the operating system and device driver level. But still, it's a level at which a lot of important things can happen.
One of the things that has bugged me for years, since we started on this in virtual reality, comes down to another tempo issue; I pointed out how we've got these hard 60 frames per second switches. This will probably bug a lot of you forever after, because once you've seen it you can't stop seeing it: tempo problems in digital video. They're all over the place. If you take, especially, a 60 frames per second video, something that should play perfectly smoothly, it is shocking how many display devices wind up with stutters and hitches in it. There are a lot of causes, but fundamentally the video has timestamps in it that say "this many microseconds here, this many microseconds there," while the actual display refreshes at slightly different times. In the old days of analog TVs, displays really were 59.94 hertz, very close to nominal, but digital systems can drift more than you might think: our phone displays are often 59.2 frames per second, or 60.1 in the best case. That alone would lead to smooth, smooth, smooth, and then once every second or two you glitch one frame, either because you had to drop one or because you held one for two refresh periods. Unfortunately, with the way a lot of these systems are written, with buffering and different timestamps and queues, you get into beat frequency problems, where things are coming out at one almost-regular interval on one side and a different almost-regular interval on the other. When I graph what should be a nice sawtooth of latency, going up, dropping down, going up, where I could at least say, "okay, I don't love that it has a little correction every second or two," we instead wind up with a jittering mess, and a lot of systems do this. I was shocked at this year's game developer awards: they had a many-million-dollar display system behind all the presenters, with nice animated graphics going on, and I'm watching it do that same damn thing: smooth, stutter, judder, smooth, stutter, judder. You see this in a bunch of other things too, like TV sets that do frame interpolation, the TV that wants to be a 240 hertz television set, so it takes the frames in and tries to fabricate, you could even call it hallucinate, in-between frames that weren't there. There's neat technology behind that, but they often wind up with tempo problems as they switch in and out of being able to do it.
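The drop-or-repeat cadence falls straight out of the beat frequency between the two clocks; a quick sketch with illustrative refresh rates:

```c
#include <stdio.h>
#include <math.h>

/* If the content timestamps say 60.000 fps but the panel actually refreshes at,
 * say, 59.2 Hz, the player has to drop (or repeat) a frame periodically to stay
 * in sync; how often is just the beat frequency of the two rates.
 * The rates below are examples, not measurements of a specific device. */
int main(void)
{
    double video_fps[]  = { 60.0, 60.0, 59.94 };
    double display_hz[] = { 59.2, 60.1, 60.0  };

    for (int i = 0; i < 3; i++) {
        double beat = fabs(video_fps[i] - display_hz[i]);
        printf("video %.2f fps on a %.2f Hz panel: one dropped/repeated frame every %.2f s\n",
               video_fps[i], display_hz[i], 1.0 / beat);
    }
    return 0;
}
```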
So one of the goals, and this had been bugging me for a couple of years as I started digging into this long pipeline (I'm literally still tearing it apart as of yesterday; this is one of the things I'm working on), is that there are things to be trimmed in all areas, and the first part was the video playback. Instead of just telling a video player "play this bunch of data I'm pushing at you," I take control at the frame level: all right, decode this frame; it's a frame; don't do anything with it yet; I will control when it goes out. I wanted to fix the pacing perfectly, so that it's a straight line where every frame has exactly the same latency. And I could do that initially: take a frame, ignore the timestamp, and just release the next video frame on the next display frame, so it's always one video frame per display refresh and perfectly smooth. The problem is that the audio drifts out of sync, because the audio is running on true time through a completely separate path, and audio systems have their own equivalent set of buffers; it goes through three different mixers before it finally makes its way out.

So what you have to do to get to this perfectly flat line, instead of the mess, is go one-for-one on 60 frames per second video, two display refreshes per frame for 30, alternating three and two for 24, and then also take the audio and slightly resample it, by half a percent or so, so that instead of being at exactly 44.1 or 48 kilohertz it's a fraction of a percent off. Some musicians would probably wind up hearing this and say, hey, you're some fraction of a semitone off in your playback, but it turns out that for most people it works great, and I can have this smooth-as-glass video playback with the audio synced up to it, and even do a better job getting the lip sync right than the standard system players do. So I've taken this playback part of the pipeline and fixed a bunch of things, and it's working well, but it's still not at all the quality that I want.
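A minimal sketch of that lock-video-to-the-display, bend-the-audio approach; the rates are examples, and a real player has to handle alternating 3:2 cadences and clock drift over time rather than a single one-shot ratio:

```c
#include <stdio.h>

/* Show each video frame for a whole number of display refreshes, then resample
 * the audio by whatever small ratio makes its clock match the display clock.
 * Rates below are examples, not the shipping values. */
int main(void)
{
    double display_hz = 60.05;   /* what the panel actually does */
    double video_fps  = 30.0;    /* what the file's timestamps claim */
    double audio_hz   = 48000.0;

    /* Integer cadence: a ~60 Hz panel with 30 fps video holds each frame for
     * 2 vsyncs (24 fps content would need an alternating 3:2 pattern instead). */
    int refreshes_per_frame = (int)(display_hz / video_fps + 0.5);

    /* Playing exactly that many vsyncs per video frame means the video is
     * effectively running at display_hz / refreshes_per_frame, so the audio
     * must be resampled by the same ratio to stay in lip sync. */
    double effective_fps = display_hz / refreshes_per_frame;
    double ratio         = effective_fps / video_fps;

    printf("hold each frame for %d refreshes\n", refreshes_per_frame);
    printf("effective video rate: %.3f fps (%.3f%% fast)\n",
           effective_fps, (ratio - 1.0) * 100.0);
    printf("audio resample ratio: %.6f at %.0f Hz (a %.3f%% pitch shift)\n",
           ratio, audio_hz, (ratio - 1.0) * 100.0);
    return 0;
}
```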
On the network interface side of things, I think there's a factor of at least 50 percent to be had just by optimizing it a lot better. Interestingly, one of the directions we've taken for the last year-plus, and I was really big on this at the beginning, was the notion of view-dependent transforms for video streaming. The whole idea of always sending a full 360-degree view when you spend most of your time looking forward seems like such a huge waste: you're really only seeing about one face of a cube, one sixth of it, and even giving yourself a lot of slop, it felt like there was a factor of four in quality to be gained. So we implemented all of this; the high-end videos get built into something like 20 different versions, and the player picks the best one for where you're looking. But if you whip your head around, you see a very blurry version, and if it fixed itself really quickly that would be great, but sometimes you can be looking at it for a couple of seconds before it corrects, and it's a little debatable whether that's been a great trade-off.

Part of that comes down to the same kinds of tempo and trade-off issues, except the blocks aren't 16 millisecond frames; in video compression you deal with what are called groups of pictures, or GOPs. Unlike an old analog system, where you could in theory go to any frame in a video, in most digital video there are only discrete places you can immediately sync to, because of the way video compression works: you send one good independent frame, and then the other frames reference that frame. You can't just jump into the middle, because you'd be missing what it references; that's how you get hundred-to-one compression instead of the ten-to-one you get out of still images. So the video is chunked up into these GOPs, and the horrible trade-off is that the bigger the GOP, the better your compression. If you really want very low bandwidth, you want large GOPs, lots of delta frames and very few intra frames, but that fights directly with being able to switch quickly between the different viewpoints. If you pick, say, five seconds, you've got a great compression ratio, but then you turn your head and it could be five seconds, best case, before the sharp version comes in, and because you've got the same kinds of cadence problems it's easy to slip past an edge and have it wind up being ten seconds for some reason. So instead you pick a small number like one second, but then you have relatively poor compression, and it becomes this trade-off: in theory I'm getting four times better resolution in front of me, but my compression has gone down thirty percent and it's not that great behind me. Part of this is the server side: internally we started off building a custom server that was going to give the minimum response time, and for various reasons it wound up shipping on something that's more or less a stock server using stock MPEG streaming protocols. In the back of my mind I still think we abandoned that too soon; we've got, again, hundreds of thousands of lines of code that are not doing exactly what we want, and I want to carve all that away, build the raw server again, and see if we can make it switch in the minimum amount of time.

But even if we can't, there are other things we can do further up the chain. All of the video productions in VR right now start with conventional cameras, maybe with wide-angle fisheye lenses, and then the footage is deformed into a projection, usually what's called an equirectangular projection, which is the Mercator-map sort of thing. You can argue about projection efficiency in a lot of different ways, like the wasted pixels up at the poles, but that's the industry standard; we use cube maps for some things, and they have different trade-offs. The important point is that any time you do an operation on an image, you're doing damage to it. Electrical engineers might talk about signal processing and the Nyquist limit, and how in theory you can recover the signal, but it's not really true in practice: every time you resample an image, especially if you then compress it, you're damaging it irreparably, so every time you change the shape of something you're throwing away some quality. It's not as dominant a factor as what bitrate you're at, but it does matter. Something I've been proud of in just the last couple of months is fixing a few of these second-order effects, like avoiding a double resampling of our user interface text and getting it into the right color space; they're second order, not as important as getting the resolution right, but they're helpful, and there are a bunch of them in the video pipeline.

So one of the things I'm working on this week is setting it up so that, in some cases, you can take what comes out of the camera, a fisheye image with its radial distortion around a center point, which looks horribly distorted, and rather than pre-processing it into one of the "correct" projections, which costs a resampling, you compress that image directly, move it all the way to the end, and put some smarts on the end. The graphics processor can sample pretty much exactly as easily from a fisheye projection as it can from an equirectangular one, and that lets us get rid of at least one, if not more than one, resampling along the way.
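Why the compositor can sample a fisheye image about as easily as an equirectangular one: the per-pixel mapping from a view ray to a source pixel is just a little math. This sketch assumes an idealized equidistant fisheye (r = f * theta); a real lens would need its calibrated distortion profile, and the focal length and image center are made-up numbers:

```c
#include <stdio.h>
#include <math.h>

typedef struct { double x, y, z; } vec3;   /* unit view direction, +z = lens axis */

/* Map a view-ray direction to fisheye image coordinates, assuming an
 * idealized equidistant projection where radius = focal * angle. */
static void fisheye_sample_coords(vec3 d, double f_pixels_per_radian,
                                  double cx, double cy, double *u, double *v)
{
    double theta = acos(d.z);          /* angle away from the lens axis */
    double phi   = atan2(d.y, d.x);    /* direction around the axis */
    double r     = f_pixels_per_radian * theta;
    *u = cx + r * cos(phi);
    *v = cy + r * sin(phi);
}

int main(void)
{
    const double pi = acos(-1.0);
    double a = 30.0 * pi / 180.0;      /* a ray 30 degrees off-axis, to the right */
    vec3 d = { sin(a), 0.0, cos(a) };
    double u, v;
    fisheye_sample_coords(d, 700.0, 1024.0, 1024.0, &u, &v);
    printf("sample the fisheye image at (%.1f, %.1f)\n", u, v);
    return 0;
}
```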
Then you start getting into some really interesting details. You've got separate cameras here, and if you want to make stereo video you're going to have to have at least two views, a view from each camera, so you want your picture to wind up looking like a fisheye view here and a fisheye view here. And again it's interesting that this will look vaguely similar to what the VR view comes out as, but they are different distortions, and of course your VR view could be looking at any part of what those fisheye views cover. But even if you say, all right, I want to take the fisheye data directly from this camera and move it over here, there are things that will mess it up even just in that transformation. For one thing, your camera is probably putting out H.264 video, hopefully at a pretty generous rate; your camera might be putting it out at 40 or 50 megabits per second, which still has compression artifacts in it, especially on things that are moving, but that's a far cry from what might be five or ten megabits per second on the final delivery. But even there, if you take the image, and you've got this fisheye centered in a normal camera frame, and you just say, well, I want to grab this part and move it over here, and take the other camera, grab it, and move it over there, dig down into what actually happens. The way compression works, both video and still-image compression, is by chopping the image up into blocks, usually 8 or 16 pixels, and you've all seen poorly compressed images where you can literally see these blocks: blurry blocks with sharp edges between them. So you've got a problem where if you say, I want to take from pixel 511 to 1712 and just chop that out and put it here, picking a number like 511 means you're slicing through one of those blocks, taking it apart and recompressing it. You have to decompress it to crop it and then recompress it back down, and you're spreading the damage that happened in the compression across multiple blocks. So I'm looking to find, again, the speed of light: what would be the best possible thing I could get here? It would be taking the output from the camera and ideally not even decompressing it. You could take the two camera streams, and it would waste more, but to find out what the peak quality would be, potentially decode two complete camera streams, so the pixels that come from the sensor make it all the way through and you do just one bilinear interpolation at the end.
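Going back to the block-boundary point for a moment, here is a small sketch (my own illustration, not anything from the actual pipeline) that snaps a crop range to 16-pixel macroblock boundaries, so a cut like "pixel 511 to 1712" doesn't slice through coded blocks:

```python
BLOCK = 16  # a typical macroblock size; many codecs also use 8- or 32-pixel units

def align_crop(x0, x1, block=BLOCK):
    """Expand the half-open range [x0, x1) outward to the nearest block boundaries."""
    ax0 = (x0 // block) * block                  # round the start down
    ax1 = ((x1 + block - 1) // block) * block    # round the end up
    return ax0, ax1

print(align_crop(511, 1712))   # -> (496, 1712): a cut at 511 would have split a block
```

Aligned crops at least keep each compressed block's damage contained within that block when you are forced to decompress, crop, and recompress; the ideal, as described above, is not to recompress at all.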
that's RGB you know I've got you know million Reds the million greens and a million blues the display is actually taking those million reds and just mashing them together into half a million actual display pixels you know which is unfortunate I on the the piece of high-end PC systems at the firmware level we can go in and take it out of that mode and take better control of directly putting it in but on the other side all the way over here on the camera cameras also have patterns that are very much like that they're called bayer patterns on the cameras but if you again look at you know under a microscope at what the camera sensors have and of course the camera sensors it's a similar thing where you might have a 12 megapixel camera this is the difference between cameras and the displays when a display talks about resolution they're talking about full pixels you know the full RGB s when a camera talks about pixels they're counting all of the sub pixels a 12 megapixel camera is 4,000 pixels by 3,000 pixels that sensors and inside those half of them are greens and same thing and then a quarter of the reds and blues divide up the inside if you take a raw camera image then you're not going through any of the processing you're reading out exactly what's in this sensor array and if you have like a rainbow color image and you take a raw camera image and you display it as a grayscale you can then pick out see exactly what the pattern is it's like okay these checkerboard is there are the greens at this offset this other pattern is the Reds of these are the blues so if we really wanted to get the best possible system instead of having this image processor like cameras and phones have this very powerful image processor system that tries to take this raw data and they do crazy stuff with it to try to make bad photographers look better you know they do all of this image balancing and sharpening and D distortion and the idea is not driven usually by science of trying to make it a better sensor it's usually driven by how can we make some magazine reviewers I have you know review of this phone camera look better so people will do the side by side photos and they just say well does this look better so they're doing crazy heuristic stuff in the image processor you know you'd really like to be able to get just that raw image in a video compression system they take they don't take red green and blue they convert it to Y UV which is an intensity color space because it's better for compression so each of these every change again is destruction of data so you wind up having even before it gets out of the camera it's been converted from RGB layer patterns into RGB into y UV and then compressed so a whole bunch of things have happened already in the best of all worlds we would notice that ok this camera sensor has twice as many greens as Reds and blue our display has twice as many greens as reds and blues what we'd really like to do is take I hijack the video encoder there and say forget this y UV conversion must have you compress greens as the main plane and then subsampled reds and blues let's take this all the way through to the end and then have the display knowing the offsets and these all things here that are different go ahead and sample directly from that image so this is one of my semi long-term plans we're looking at the entire system Indian and these are all by different companies I mean in the number there's ten companies involved in this entire pipeline of things going through but there's optimization 
But there's optimization potential to be had all throughout it, and I believe that in the best case, by wringing all of the fat out of this without changing any of the hardware at all, there are factors of several in the quality that we can deliver to users. Now, it's always great when things get faster and we get higher resolutions, but we are hitting some of these limits. The phone displays are probably not going much past 4K, let alone the 8K or 16K that we would like to have. In some ways we are running out of steam on the Moore's Law free ride of graphics power, where we're settling now for a 30 percent boost each year in the power we get out of these things rather than knowing it's going to double every 18 months. Bandwidth is still going up at least a little bit, but there are hard fundamental things to worry about with spectrum and how much we could possibly get out of 5G. So these systems optimizations are still fertile ground for finding a lot of value, and I'm excited about this as one of my big pushes for this year — the push on latency was the thing from a couple of years ago, and we have all sorts of other initiatives going on — but as you can tell, I'm still super excited to be an engineer working on all of this. We've already run over our time a little bit, but I've got time to take some questions; we've got a while till we get chased out of the room.

You're talking about the continual advances that are happening at different companies — how do you incorporate that into your workflow? Something like foveated rendering could potentially be a huge savings in how much you have to get out of the GPU.

So the question is how we take in advances that might be happening at other companies, with foveated rendering as the specific case. Normally you've got a graphics screen and you draw the entire thing at a fixed quality level, but if you knew exactly where the user was looking — if you knew he was looking over here — you could draw high resolution there and let it fade out to much lower resolution across the rest of the view. Something most people aren't aware of: the bandwidth of the optic nerve, what comes out of your eyes, is actually a lot less than what your monitor or VR headset pushes right now. The resolution you have in your fovea, the tiny area you focus on, is very high — people talk about retina-resolution displays, and that pretty much determines it; you can argue about some vernier-acuity effects, but when you take a couple million pixels and hold them at arm's length, that's about the limit of your visual acuity. But something over here that's still in your view — you've got over a 180-degree field of view, but only a couple of degrees where you've got that acuity, and out here things can be doing crazy things and you don't even notice. One of the examples I point out: I've got a dual 30-inch monitor system that I work on, and I can be looking at this one and still see the other monitor, but I can be waving my mouse cursor around, which is like 256 pixels, and I can't find it; I actually have to go and look at it, because in my peripheral vision 256 pixels can be changing and I'm not even noticing. So there's enormous potential upside for doing this, and it becomes mandatory if we look at where we want to be with virtual reality, which really would be something like 16K displays.
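For a rough sense of how much foveation could save, here is a back-of-the-envelope sketch using a deliberately simple, made-up acuity falloff (acuity dropping to half a couple of degrees off the gaze point); the exact model and the e2 parameter are assumptions for illustration, not vision-science results.

```python
import numpy as np

def relative_acuity(ecc_deg, e2=2.5):
    """Toy model: acuity at eccentricity e falls off as 1 / (1 + e/e2),
    where e2 is the eccentricity at which acuity has halved."""
    return 1.0 / (1.0 + ecc_deg / e2)

# Samples needed across half of a ~100-degree field, in one dimension:
# matching foveal density everywhere versus matching the falloff.
ecc = np.linspace(0.0, 50.0, 1001)      # degrees away from the gaze point
de = ecc[1] - ecc[0]
uniform = np.sum(np.ones_like(ecc)) * de
foveated = np.sum(relative_acuity(ecc)) * de
print(f"roughly {uniform / foveated:.1f}x fewer samples per axis with foveation")
```

Squared for both axes, that's well over an order of magnitude fewer pixels for the same perceived sharpness, which is what makes driving something like a 16K-class display even conceivable.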
We're just not going to get that — although it's interesting, you could fabricate a 16K display out of tiled silicon microdisplays today, but it would be very difficult to drive it with anything like the quality we deliver right now, because most high-end graphics systems still wind up being pixel-bound. But if you knew exactly where the user was looking, you could say, all right, I'm going to render a lot right there and much less elsewhere, and there are lots of interesting schemes for exactly how you tell the graphics processor to do that. My pet scheme for foveated rendering: usually people talk about rendering one view here and a lower-resolution view there, but if you look at, say, the Wikipedia article on the foveal region, you'll see a graph with kind of a hyperbolic shape for the quality you need in the middle versus the quality on the outside. When I look at that I think, that looks kind of like a hyperbolic asymptote, and there's a way you can get that with conventional rendering and ordinary perspective math, which is if you wind up looking down the corner of a cube. So you have basically cube rendering, and you can get a pretty similar falloff shape to that, and you can back the view up a little bit. There's a ton of this work going on at a lot of different companies about exactly how you wind up doing it.
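Here is a quick sketch of the math behind that cube-corner idea as I understand it — a worked example of my own, not the actual scheme. A flat perspective projection naturally spends more samples per unit of solid angle away from the center of its face, so pointing a face corner at the fovea gives you a built-in falloff away from the gaze direction.

```python
import numpy as np

def density_vs_face_center(theta_deg):
    """Relative sample density (pixels per solid angle) of a planar perspective
    projection at angle theta off the face's optical axis, normalized to the
    face center. For a plane at unit distance the Jacobian gives sec(theta)^3."""
    return 1.0 / np.cos(np.radians(theta_deg)) ** 3

corner = np.degrees(np.arccos(1.0 / np.sqrt(3.0)))  # a cube-face corner is ~54.7 deg off axis
print(f"corner vs face center density: {density_vs_face_center(corner):.1f}x")
```

So if the gaze direction is aligned with the cube corner, the sample density at the fovea is roughly five times what it is about 55 degrees away, with a falloff shape that looks a lot like that foveal acuity curve, using nothing but ordinary perspective rendering.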
On the question of interacting with the other companies: in Facebook and Oculus's case, the standard behavior is to buy a company. As a side note, I guess it's almost an open secret that in a lot of hot technical areas — if it's something that's really on fire in the media and everybody's talking about it — the smart move is to get together with your friends, found a company, and be acquired, rather than necessarily just putting in your resume. That's happened far more than I would care for. We've had a hard time hiring computer vision people, and we've acquired a few companies for the people who work on that; we've acquired an eye tracking company, and we have people working on it in research. From my position I've always just been happy to see whatever anybody makes; it's exciting to work with whatever toys anybody can provide. It is interesting being on the other side, though, where I have some things I push for internally, since we're actually building displays and drivers and things like that, and I can say, well, it would be great if we could change something here, it might work out. And it's a tough argument to have there, because it's speculative — I can say I think this would be a great idea, and it's one thing for me to toss that over the wall at another company, but internally somebody has to decide whether we really want to devote resources to it, and it becomes a tough trade-off.

I do think foveated rendering — well, we still have a lot of debate about the quality of eye tracking. There are systems right now that people can get demos of that do eye tracking and foveated rendering, and there are a lot of people who say they're just not good enough, because of the same thing I've been talking about here: the stacked-up latency. The way these eye trackers work is they're cameras, cameras that basically look at the eye, and they go through a computer vision processing step, and that stacks up a whole lot of latency before you even get a value out of the sensor that can be fed to the system where you might be choosing to re-render things in different places. So we come down to possible ways to bring it in as late as possible into the graphics processor, ways to push things around, and I feel pretty good about it; I don't think that's one of the deep unsolvable problems of our industry. And in fact, the push that's valuable right now is that foveated rendering is helpful even without eye tracking, just because the lenses in the head-mounted display fundamentally distort the view, so you have more detail in the center and it's blurrier on the outside simply from the way the optics trace out. So if you have low overhead, it makes sense even today to render a foveated center view and a blurrier outside view. But this is one of those engineering things that becomes a tough trade-off, because if you have a high-overhead driver interface, you'll spend more time doing the second rendering pass than you actually save: you save on the pixel or fragment side, but you spend more on the CPU, and depending on your system you might be bound one way or the other.

So the question is about accessibility features for people with various visual impairments. There are some things — in fact, some of the headsets, like the Gear VR, have a focus knob, so I can wear it without glasses and adjust the focus enough, or I can just jam it on over my glasses somewhat uncomfortably. There are possibilities being investigated with varifocal displays, in some cases for people who have particular eye problems. Now, when people hear that we do what's called distortion correction to correct for the lenses, they immediately think, can we use that to correct for eyeglasses? Unfortunately, not directly: the distortion correction we do in head-mounted displays today is spatial distortion, while most of the problems that people have that require glasses are focal distortions. But there is technology being worked on to dynamically adjust focus in different places, because that is one of the things — even if we had infinite resolution in the head-mounted displays today, you could tell the problem. I remember doing the early work where I'd be looking at my desk in reality, and then there was a similar table in the virtual reality I was looking at, and you'd take the headset on and off. You don't think about it consciously, but when you're looking at your desk here, everything down there goes out of focus because it's further away, while in VR it all stays in the same crisp focus. This is also one of the things that causes eye strain: your brain normally does two things at the same time — it adjusts the focus and it verges your eyes to converge on the stereo target. In a head-mounted display your eyes verge to the right place, because the two images are drawn in separate places, but the focus never changes. That's something that doesn't cause the sick-to-your-stomach feeling, but it can give people a headache and cause some of the eye strain. So there's hope for some of these things, with varifocal displays doing some of those possible corrections.
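For reference, the spatial distortion correction being described is conceptually something like the sketch below: the texture-lookup coordinate for each output pixel is pushed outward as a function of radius, so the rendered eye buffer is sampled with a barrel distortion that the HMD lens's pincushion distortion then approximately cancels. The polynomial coefficients here are made-up placeholders, not real lens values.

```python
import numpy as np

def barrel_warp(xy, k1=0.22, k2=0.24):
    """Radially warp normalized coordinates (origin at the lens center) used to
    sample the eye buffer. Pushing the lookup outward renders a barrel-distorted
    image that a pincushion-distorting lens straightens back out. k1, k2 are
    illustrative coefficients only."""
    r2 = np.sum(xy * xy, axis=-1, keepdims=True)
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return xy * scale

pts = np.array([[0.0, 0.0], [0.5, 0.0], [0.7, 0.7]])
print(barrel_warp(pts))   # the center stays put; points near the edge move outward more
```

The limitation mentioned above is the key one: this only reshapes where pixels land on the retina, and does nothing about focus, which is why it cannot substitute for an eyeglass prescription.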
And then there are some exotic display types that could potentially bypass some of the other optical issues. For a long time, early on, I was excited about direct laser retinal projection, the idea of directly scanning lasers into your eye — and yes, I did build my own laser rig, put it on the lowest possible setting, and very carefully peeked under it. When I did that, one of the manufacturers was basically, don't do that, your eyes are too precious. But it was the lowest power and I didn't hurt myself. The interesting thing about that is that so many of the optical problems people have to correct for with glasses and so on are matters of focus: you have light coming in that might originate from a single point, but it spreads out to cover the whole lens of your eye, and then the lens has to bend it back down to a sharp point on your retina to be read. With a laser, as the beam scans over, it still passes through your lens, but it's fundamentally coming from essentially one point as it sweeps around, and that has possibilities for being better for some people. Even at the simpler level of people who just have really poor eyesight, one of the things I'm kind of excited about is that we just rolled out a major change to our home system, our web browser, and these other things in VR, and one of the neat things is that I fought a long, hard battle to get our user interface people to accept that putting things on flat surfaces is still the right thing to do even in virtual reality — before that, everybody wanted to do crazy 3D stuff all over. Putting it on a surface is nice because it means we at least have the possibility of just moving closer to the surface, making it work better for people in some ways.

As far as eyes jumping around, we still have debates about how important some of this is. Eyes do what are called saccades, which are very quick jumps to different places — I don't know if that's specifically related to the condition you're talking about — but there are theoretical discussions we have because the displays that we use in VR are low persistence, meaning they're not continuously illuminated. If you take a normal display, like an iPhone in a cardboard case, and you do VR, it's terribly blurry, because the LCD is on the whole time and when you look around everything smears across the entire interval between frames. To make the displays crisp we basically flash the display on briefly, and there are two ways that happens — and this has camera analogues; the same things happen in cameras. You can have a global shutter, where the entire image is blasted at you at once, which seems to be the best, with no distortions; or you can have a rolling shutter, where the rows come on like an old CRT and are turned off some distance behind. A lot of people still think there's something negative happening there — what if your eyes glance around while it's scanning out, and the eye isn't seeing what it should? We still argue about whether it's an actual problem, but there's some reasonable debate. It definitely causes time distortion: if you don't correct for it and you're looking side to side while the display is scanning down, the whole world does what we call waggling. It takes something like 15 milliseconds to scan from the top to the bottom, and if you've looked over during that time, then by the time the bottom scans out it should have been drawn for where you're looking 15 milliseconds later, so the whole world appears sheared, and if you're going back and forth it looks like the whole world is wagging. But you can correct for that with the same kind of time warp: you can say, not only do I believe I'm looking here now, I believe that at the end of this frame I'm going to be looking slightly over here, so I'm going to distort the whole thing accordingly. It's not perfect, but it's really pretty close.
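A minimal sketch of the kind of correction being described: interpolate the predicted head pose across the scan-out so the top and bottom of a rolling display get warped by different amounts. The 15 ms scan-out time, the constant-rate yaw model, and the single-axis simplification are assumptions of mine for illustration.

```python
import numpy as np

SCANOUT_MS = 15.0   # illustrative time for the display to scan from top to bottom

def per_row_yaw(yaw_start_deg, yaw_rate_deg_per_s, rows):
    """Predicted yaw at the moment each display row is actually lit, assuming the
    head keeps turning at a constant rate during the scan-out."""
    t = np.linspace(0.0, SCANOUT_MS / 1000.0, rows)
    return yaw_start_deg + yaw_rate_deg_per_s * t

# Turning your head at 200 deg/s, the bottom row lights up ~15 ms after the top row,
# so without correction the image is sheared by about 3 degrees from top to bottom.
yaws = per_row_yaw(0.0, 200.0, rows=1440)
print(f"shear over one frame: {yaws[-1] - yaws[0]:.1f} degrees")
```

In practice you would re-project each row, or each band of rows, with its own predicted pose rather than one pose for the whole frame, which is what turns the "waggle" back into a stable world.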
So the question is about the Emmys. There were actually two of them, and there's an interesting story there. The Emmys decided they wanted to start honoring game technology — these are the technical Emmys, not the red-carpet broadcast — and they're usually for things like people who have made advancements in the way films are made, or the cameras, or the post-production work. I got two, for user-generated content and for game engine technology, so I had two of the great big Emmy statues. They're massive; I joked that they're a combination bludgeoning and puncture weapon, because they're sharp on the end and weigh about ten pounds. But the funny thing was that a year after that there was a revolt among the film and television people, who didn't want to be on the same footing as the video game people, and they downgraded those awards, so the one I got a couple of years later was a Lucite thing like you'd get from anywhere else. But yeah, I've still got one of the big Emmys on the mantle. [Music] [Applause]
Info
Channel: UMKC School of Computing and Engineering
Views: 138,299
Rating: 4.9692745 out of 5
Keywords:
Id: lHLpKzUxjGk
Length: 80min 28sec (4828 seconds)
Published: Tue May 30 2017