JAKE ARCHIBALD:
Hello, I'm Jake, and I work in Developer Relations. This means I live
in constant fear that my developer
skills are going to rot and fall off, because I spend
too much time doing stuff like this rather than
building actual real stuff. This is why when someone
in Dev Rel builds a thing, we won't shut up about it. It's our proof to the
world that we still got it. We're still cool. We're still one of
you, a developer. And on that note,
look what I made. It's a little
responsive web app that lets you search for and
read Wikipedia articles. Now I know what you're thinking. Hasn't this already been
created before by Wikipedia? Well, yes, shut up. Forget about that. That's not the point. I want to talk
about performance. First up, let's
immerse ourselves in the current load time. Ready, steady, go. That wasn't so fun. That was the load time
of one of the articles on a 3G connection. It's important to watch
the 3G load times, because even though we
have 4G now, those users are on 3G or worse
a lot of the time, a quarter of the time
in the US, half the time in large parts of Europe. So here's our problem. We saw 2.7 seconds of nothing
and a further 2.1 seconds of basic interface without
meaningful content, just a toolbar and a spinner. Even on 5 megabit, we're waiting
over two seconds for content. As users of the web, we
know this kind of load time is a bad experience, but
that bad experience directly impacts download conversions,
donation conversions, and outright revenue. And there are some
studies that you can throw at the money people to
convince them that performance really does matter. I'm going to show you how
you can slash the load time of something
like this, and we'll add in some cool new features
along the way as well. So here's the markup, roughly. It's got CSS, JavaScript,
and nothing else. I'm relying on JavaScript
for all my rendering, which is kind of bad. So don't do that. Our initial render
is pretty static. So let's do it
without JavaScript. So we'll add some markup in for the title bar and mark the JavaScript as async. Now it won't block rendering, and it will execute whenever it finishes loading. Doing this knocks around half a second off our first render time on 3G, and the bigger your JavaScript is, the bigger the gains you'll see from this fix.
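As a rough sketch, the shell markup plus async script might look like this (the file name and class names are hypothetical, not taken from the demo):

```html
<!-- Static shell: renders before any JavaScript arrives -->
<header class="toolbar">
  <h1>Wiki viewer</h1>
  <input type="search" aria-label="Search articles">
</header>
<main class="article"></main>
<!-- async: downloads without blocking rendering, executes whenever it lands -->
<script src="/all.js" async></script>
```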
But we're not done: we need to prioritize our CSS. We can't render until all of our CSS is downloaded, but we only actually need a tiny fraction of it for the first render. So we'll do this: we'll inline the bits for the first render and then load the rest asynchronously using JavaScript. The Filament Group created
loadCSS to do just that. It's a tiny script that you can inline in your page. So that's what we'll do: we'll hide our article element so we don't get a flash of unstyled content, we'll load our CSS, and once it's ready, we'll show the article. This is a huge win for slower connections. Only 1.4 seconds of blank screen on 3G; that's a huge improvement. And the bigger your CSS is, the bigger the gains you'll see from this fix.
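Sketching that pattern, assuming loadCSS and its companion onloadCSS callback (both from the Filament Group) are inlined, and with a hypothetical rest-of.css file name:

```html
<style>
  /* Inlined critical CSS: just enough for the toolbar and first paint */
  .article { visibility: hidden; } /* avoid a flash of unstyled content */
</style>
<script>
  // loadCSS/onloadCSS assumed inlined above; rest-of.css is illustrative.
  var link = loadCSS('/rest-of.css');
  onloadCSS(link, function () {
    // Full styles are in, so it's safe to reveal the article element.
    document.querySelector('.article').style.visibility = 'visible';
  });
</script>
```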
Now, I realize there's been a lot of code and graphs so far, and that actually goes against the guidance we've had for creating these videos. So to redress the balance, here are some pictures I took at a zoo. [MUSIC PLAYING] Welcome back. So we're down to 1.4 seconds
on 3G, but all we've improved is the time to this,
not the actual content. Let's fix that. Our bottleneck is once
again our JavaScript. You see, the browser
makes a request. It gets back a
page, and that page tells the browser to go fetch
some JavaScript and CSS. And then that JavaScript
tells the browser to request the
article data, which we get from Wikipedia's
API plus a few alterations. You see the problem? We've made two back and
forths before we even think about downloading the content. This is super inefficient
and a big problem with JavaScript-rendered
sites, particularly those created with frameworks
as the JavaScript tends to be pretty big. Instead, let's render
the page on the server. So the request goes out,
we compile the content on the server, and
send back plain HTML. So how much quicker is that? It is worse. Can we cut? [MUSIC PLAYING] OK, OK, I figured it out. Wikipedia is a bit
of a bottleneck. Our API request to them takes
around 900 milliseconds, probably because Wikipedia contains five million articles covering everything from quantum physics to the rule of threes, and they're being accessed thousands of times a second. But you might run
into the same problem with many third-party APIs,
maybe even certain database requests on your own server. So our server gets the request,
it goes off to Wikipedia, takes that 900 millisecond
hit, and only then does it send stuff
back to the client. In the meantime, the user's
left looking at a blank screen. But there's a better way. We fix this by streaming the
response using chunked encoding or multiple data frames
if you're speaking HTTP/2. This allows us to
start sending the HTML before we have
the whole content. So we respond immediately
with our header and toolbar. That gets us this
fast first render and lets the browser know about
the JavaScript and extra CSS. Then as we get content
back from Wikipedia, we can transform it and
send it on to the browser. This is quite easy with a Node.js or Go backend. With Node.js, I can just call write whenever I have something worth sending, or I can pipe a stream to the response. There's also the Dust.js templating language.
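A minimal Node.js sketch of that idea — streamPage and the fetcher are illustrative names, not code from the demo:

```javascript
// Flush the static shell immediately, then append the article once the
// slow upstream call resolves. streamPage is a hypothetical helper name.
async function streamPage(res, articlePromise) {
  // First chunk: header and toolbar go out before Wikipedia responds,
  // so the browser can render and start fetching CSS/JS right away.
  res.write('<head>…</head><body><div class="toolbar">…</div>');
  // Second chunk: the transformed article, roughly 900ms later.
  res.write(await articlePromise);
  res.end('</body>');
}

// Wiring it up, roughly:
// const http = require('http');
// http.createServer((req, res) => {
//   res.writeHead(200, { 'Content-Type': 'text/html' });
//   streamPage(res, fetchArticleFromWikipedia(req.url)); // hypothetical fetcher
// }).listen(8080);
```

Each res.write goes out as its own chunk under chunked transfer encoding, so the browser starts parsing the shell long before the article arrives.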
I don't much care for the syntax, but it supports streaming. It'll output as much
as it can until it encounters a
template value that's either a promise or a stream. And then it'll wait
for that promise to fulfill or pipe the stream. And the result-- we fixed
our first render time and massively improved the
content rendering time. Let's look at that side
by side with the first JavaScript-driven iteration. We'll set them off
at the same time, and you can see the difference. We are now web
performance winners. [MUSIC PLAYING] But wait, what about the second
load with our populated cache? Currently, cached load times are not dissimilar to normal load times. Our bottlenecks are
making a request to the server and the server
getting data from Wikipedia, and that's the best case. We cannot rely on the browser
cache for performance. Stuff falls out of the
browser cache all the time, or we as developers invalidate
it by making code changes, because that's our job. Also, there's a connection
type we haven't catered for. No, not offline, this. I call it Lie-Fi. Offline? Offline is OK. At least it's honest. Can I fetch this? No. Can I go here? No. Can I do this? No. Lie-Fi is like offline,
but it trolls you by pretending to be online. It'll attempt to make a
connection for minutes and still fail. Let's fix this. Let's take control of
the cache and page loads using Service Worker. Now I'm not going to dive
into the ServiceWorker API. There's an HTML5 Rocks
article for that. But here's the concept. During the first server-rendered load, we register a ServiceWorker. Then it gets everything it needs from the network to render a page-- the CSS, JavaScript, and basic page shell-- and puts them in a cache. Now, unlike the standard browser cache, items aren't automatically removed from this one.
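Conceptually, the worker looks something like this. The event wiring in the comments uses the real ServiceWorker APIs; the asset list and cache name are hypothetical, and the serve-from-cache decision is pulled out as a plain function:

```javascript
// Hypothetical shell assets and cache name for a demo like this one.
const SHELL = ['/', '/all.js', '/rest-of.css'];

// Plain decision function: prefer the cached copy, fall back to network.
function cacheFirst(cached, fetchFromNetwork) {
  return cached ? Promise.resolve(cached) : fetchFromNetwork();
}

// In the worker script itself, the wiring would be roughly:
// self.addEventListener('install', (event) => {
//   event.waitUntil(caches.open('shell-v1').then((c) => c.addAll(SHELL)));
// });
// self.addEventListener('fetch', (event) => {
//   event.respondWith(caches.match(event.request)
//     .then((cached) => cacheFirst(cached, () => fetch(event.request))));
// });
```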
For the next page load, we're going to go back to rendering
on the client, but this time, it's supercharged
by the ServiceWorker. The browser requests an
article, and the ServiceWorker responds with the HTML,
CSS, and JavaScript, and this is super
fast as it doesn't require the network at all. The connection type
doesn't even matter. It's all from a local cache. Now the page asks
for article content. This delay made our
client render slow before, but the ServiceWorker
preempted this request along with the initial page, and
it's already on its way. This absolutely slashes
our first render time to almost instant, but our
content render time kind of suffers. Remember the problem we saw
with our first server render? Well, we've kind of just
recreated that on the client. Our JavaScript pulls down
the full Wikipedia article before it puts it on the page. We're losing time here,
because we've got some content, but we're not showing
any of it to the user. Over the next year, you'll see a new API land to fix all this-- the Streams API. Parts of it are landing in Canary already, so we can make some use of it. Here I fetch the article, but instead of getting the full text, I get a stream reader and start siphoning off the content as it arrives. I write the result once when I have the first 9K, and then I write again once I have the rest. Writing it to your HTML twice like this is kind of hacky, but as streaming APIs land in the browser, we'll get access to the proper streaming HTML parser.
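Roughly, that reader loop looks like this — readInChunks is a name I've made up, and in the page you'd get the stream from a fetch response body:

```javascript
// Siphon text off a ReadableStream as it arrives. In the browser the
// stream would be fetch(url).then((r) => r.body); this needs a runtime
// with ReadableStream and TextDecoder (modern browsers, Node 18+).
async function readInChunks(stream, onChunk) {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let html = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    html += decoder.decode(value, { stream: true });
    onChunk(html); // e.g. write to the page once ~9K has arrived
  }
  return html;
}
```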
But even this hacky solution has improved things. We've retained the quick first render, but now our content render is much better. But now that we have
a ServiceWorker, we can make even
greater use of it. The final step-- if we've got
ServiceWorker caching assets, why not let it cache articles? You could cache articles automatically, but I'm going to let-- [DING] --the user decide. With a fully cached article, the content load time drops to under half a second. Not only that, it's that fast on
Wi-Fi, it's that fast offline, and it's that fast on Lie-Fi. We don't leave users
with old content either. When the user looks at a cached article, we can then go to the network in the background and look for updates. If we find some, we can just update the content on the page. [DING] When swapping content on the page, we need to ensure it's not disruptive to the user.
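That cached-then-refresh flow can be sketched as a small helper — serveWithRefresh and its callbacks are hypothetical names for illustration:

```javascript
// Show the cached article immediately; check the network in the
// background and notify only if the content actually changed.
function serveWithRefresh(cached, fetchFresh, onUpdate) {
  fetchFresh().then((fresh) => {
    if (fresh !== cached) onUpdate(fresh); // swap content, or show a notification
  });
  return cached; // the user sees content with no network wait
}
```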
Wikipedia changes are usually small, so it isn't particularly risky here, but we could detect
bigger changes in content and instead show a
notification inviting the user to click something in
order to see the updates. These are the things that make
the difference between a web app and a great web app-- get to first render before JavaScript, render with minimal inlined CSS, render on the server with response streaming, leverage the ServiceWorker for caching your content shell, and even use it for offline-first content. This is how we
make the web fast. You can check out the
Wikipedia demo on GitHub, and if you're interested
in other smart uses of ServiceWorker, check
out the Offline Cookbook, SVGOMG, Trained to Thrill,
and the Google I/O website. And next time
someone from Dev Rel shows you something they've
made, give them a hug and tell them they're a true
developer just like you. Seriously, we need this. [MUSIC PLAYING]