I love isometric games. Maybe it's just nostalgia, but there's
something about the hand drawn art style with the illusion of dynamic 3D that's really
appealing. In this video, I'll show you how isometric
coordinates work, and I'll take you through the math to project an isometric grid into
screen coordinates and back again. Let's start with the simple case. In a top down 2D game, we might have a sprite
asset like this. It's easy enough to tile the same asset across
the screen - just using multiples of the tile size - and everything looks fine. It's when we add some perspective that things
get a little weird. If we draw a 3D box on a grid like this, it
looks fine. But if we now move that same asset around
on the grid, it doesn't look quite right. This is because of perspective - things closer
to the point of view look bigger than things further away. This is a problem for hand drawn art because
we would need to draw a separate asset for every possible position. But if we move the camera back really
far away, and zoom in at the same time, the perspective becomes virtually unnoticeable. In fact, if we move the camera back *infinitely*
far away, there's zero perspective distortion. We call this an orthographic projection. The key here is that all the lines are parallel. Let's take a look at a hand drawn isometric
cube sprite. Taking a closer look, you can see where the
grid would line up against this cube. For every two pixels on the horizontal axis,
we move vertically by one pixel. Now this cube looks fine, so what if we try
to tile it like we did before? This would make a nice bathroom wall, but
it's not quite what we're going for. The problem is that we need to transform the
coordinates of each sprite to align it with the isometric grid. We'll start with our isometric grid, and draw
two lines - one to represent the length of a single grid tile in the x axis (i hat) and
one for a single grid tile in the y axis (j hat). What we need to do is visually *distort* this
grid - first we'll rotate it by 45 degrees, then squash it in half. The original grid is shown underneath, because
we'll need that next. To represent this distortion mathematically,
we now need to measure `i hat` against the original grid. `i hat` moves along the x-axis by a single
tile still, but it now also moves down the y-axis by half a tile. We can write that out as a 2D vector, or two
numbers: (1, 0.5), with y increasing down the screen. Following the same process for `j hat`, we see that
it moves in the negative x direction by one tile, and down by half, giving (-1, 0.5). Now these four numbers are everything we need
to describe the distortion of our grid. We can now multiply any x coordinate on the
isometric grid by `i hat` and the y coordinate by `j hat`, then add them together. For example, say we wanted to find where grid
coordinate x = 3, y = 1 would appear on screen. We multiply 3 by `i hat` and 1 by `j hat`. This gives us a new 2D vector, (2, 2) in this case, representing
the screen coordinate. So applying that math to our coordinates,
it looks... not quite right. That's because, just like in the top down example,
we need to account for the size of our grid tile. In our `i` and `j hat` vectors, we could multiply
each x value by the width of the sprite, and each y value by the height of the sprite. Better, but there's now too much space between
each sprite. If we draw our sprite over the top of the
grid, the problem becomes clear. The size of our sprite actually takes up *four*
tiles on our reference grid. So we actually need to halve both the width
and height of the sprite to get a value representing a single reference tile. Simplifying that, we now get a new set of numbers
to represent the transformation. And, much better. But we're not quite done. You'll notice at the top left, the origin
of our isometric grid doesn't quite match the top left of the screen. This is because the origin of each tile actually
appears in the centre of the sprite, not the top left. So we need to account for that by offsetting
everything by half the width of a tile sprite. And finally, we can offset the origin of the
grid to the centre of the screen. Now we're displaying an isometric grid, but
how about interaction? If we want mouse interaction, we'll need to
go the other direction - to figure out which tile is being selected from the position of
the mouse. Fortunately, maths saves us again. We already have our four numbers representing
the transformation from isometric grid to screen, so let's name them a, b, c and d for
now. If we write them out in a grid like this,
we have a matrix. And the nice thing about this is that there's
a rule for inverting a matrix - put simply, converting it to a new matrix that does the
opposite. So by simply reordering the numbers, changing
the signs of two of them, then multiplying each number by one over the determinant (that's the
bit out the front), we can get a new set of our four numbers that works in the opposite
direction. Then we just multiply the cursor coordinates
by the new reversed `i hat` and `j hat` and that gives us the specific grid coordinate. And finally, to add some depth we can offset
the screen `y` coordinate by some amount as a sort of third dimension. I hope you found this useful. If you want to go into the math in a bit more
detail, check out 3Blue1Brown's excellent video on linear transformations and matrices. That's it, and I'd love to hear from you if
you do build something with this.
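
For anyone following along without the visuals, here is a rough sketch of the whole round trip in Python: grid to screen using `i hat` and `j hat`, then screen back to grid using the inverted matrix. The class and method names (`IsoGrid`, `to_screen`, `to_grid`, `tile_at`, `sprite_top_left`) are made up for illustration, as is the 32 by 32 sprite size. The depth step of half a sprite height per level and the use of `floor` to pick a tile are assumptions, not something stated above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IsoGrid:
    """Isometric projection sketch; every name here is illustrative."""
    sprite_w: float        # full sprite width in pixels
    sprite_h: float        # full sprite height in pixels
    origin_x: float = 0.0  # where grid (0, 0) lands on screen, e.g. the centre
    origin_y: float = 0.0

    def basis(self):
        # i hat = (1, 0.5) and j hat = (-1, 0.5), scaled by half the sprite
        # size because one sprite covers four reference tiles.
        a = 1.0 * self.sprite_w / 2   # i hat, x component
        c = 0.5 * self.sprite_h / 2   # i hat, y component
        b = -1.0 * self.sprite_w / 2  # j hat, x component
        d = 0.5 * self.sprite_h / 2   # j hat, y component
        return a, b, c, d

    def to_screen(self, gx, gy, gz=0.0):
        """Grid coordinate -> screen position of the tile's origin."""
        a, b, c, d = self.basis()
        sx = self.origin_x + gx * a + gy * b
        # Fake depth by lifting the tile up the screen. Half a sprite
        # height per level is an assumed step size.
        sy = self.origin_y + gx * c + gy * d - gz * self.sprite_h / 2
        return sx, sy

    def sprite_top_left(self, gx, gy, gz=0.0):
        """Where to draw the sprite: the tile origin sits at its top centre."""
        sx, sy = self.to_screen(gx, gy, gz)
        return sx - self.sprite_w / 2, sy

    def to_grid(self, sx, sy):
        """Screen position -> fractional grid coordinate (ground level only)."""
        a, b, c, d = self.basis()
        det = a * d - b * c
        x = sx - self.origin_x
        y = sy - self.origin_y
        # Inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
        gx = (d * x - b * y) / det
        gy = (-c * x + a * y) / det
        return gx, gy

    def tile_at(self, sx, sy):
        """Tile under a cursor position; flooring is an assumption."""
        gx, gy = self.to_grid(sx, sy)
        return math.floor(gx), math.floor(gy)


grid = IsoGrid(sprite_w=32, sprite_h=32)
print(grid.to_screen(3, 1))  # (32.0, 32.0)
print(grid.to_grid(32, 32))  # (3.0, 1.0)
```

Keeping the four numbers in one place (`basis`) means the forward and inverse transforms can't drift apart if the sprite size changes. Note that `to_grid` ignores the depth offset, so picking tiles that have been raised would need extra handling.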