Here in this folder is the smallest possible
executable file, coming in at a mere: 0 bytes? So, technically you can take a blank
file and rename it with the .EXE extension, but if it can't run, it doesn't really count. So what makes an EXE actually executable? To help explain exactly that, I reached out
to my good friend William Gates, who sent me this archival footage from Microsoft they used
to show to new operating systems engineers. Hi, and welcome to the wonderful world of
Windows. Today we’re going to be taking a look at the Windows Portable Executable File format.
Once you've written your program in a high-level language like C or C++ you'll want to throw it
through the compiler, assembler, and linker to produce the final Image of the program in
the form of a Windows Portable Executable, colloquially known by the file extension: EXE. This file contains what is called an “Image” of the program captured in a state ready to be run
by the computer. The PE file contains a series of headers which contain information about the file's
different sections which contain the code, data, and references to external symbols
that are required by the program, with your code now compiled to machine code for
the specified instruction set architecture. For the Windows Operating System to run the EXE
it'll: verify that the file is a valid executable by checking the headers, load the code, data, and
other sections into memory, resolve any of those external dependencies with the proper dynamic-link
libraries (DLLs), and then begin the program's execution from the specified entry point. So if you ever think that Windows doesn't have
enough bloat, just remember that every EXE still contains the string "This program cannot be run
in DOS mode" in the MS-DOS legacy header. But creating an executable isn't as straightforward as C code in equals machine code out.
This simple Hello World program once compiled is 111,104 bytes, that’s 1,736 times larger
than the source code file. But even though this program is only written using 64 bytes
worth of characters, what is it actually doing? It's including the standard IO library,
which relies on the default standard libraries and system APIs provided by the OS which allows
the program to elegantly call the printf function to output text to the console without having to
worry about the details of directly dealing with the said complexities of the operating system.
That’s the key thing to understand, programs are made for systems. Whether it's for Windows, Mac,
Linux, your XBOX, or your homemade smartwatch, that compiled program is explicitly designed
to interact with that system, and that's great because those operating systems are in turn
designed to deal with the device’s hardware and provide that layer of abstraction that allows
programmers to type only a couple of lines of code and change the pixels on a monitor. But what is the minimum viable program that an operating system like Windows
will load and execute? In other words, what is the smallest possible EXE? That’s a question philosophers
have pondered for ages, and it requires a deep understanding of the
technology not just behind how Windows operates, but computers themselves. But don’t worry if
you need a refresh or a crash course in computer science, thanks to today’s sponsor: Brilliant,
where you learn by doing with thousands of interactive lessons, not just in computer science
and programming, but also in math, data analysis, and much more. Each lesson on Brilliant has been
carefully crafted to encourage hands-on problem solving to increase your analytical thinking.
Brilliant can help improve your understanding of real-world topics, like through their How
Technology Works course that dives into the working principles of computer memory,
cryptography, and video compression.
To try everything Brilliant has to offer for free
for a full 30 days, visit brilliant.org/Inkbox or click on the top link in the description. You'll also
get 20% off an annual premium subscription. Again, that’s brilliant.org/Inkbox. To create the smallest possible EXE all we have to do is create a minimal program, remove
everything that isn't absolutely necessary in the resulting file, and then squeeze it to
make it even smaller until something finally breaks. So to start, I'll create the smallest possible
program right here. Yeah, it’s super boring. Which is why I'm also making a second program that will
create a message box, it’ll be a little bigger, but at least it’ll do something that
will be visually interesting. Now compiling this version of the program normally
with Microsoft’s standard MSVC in 64-bit mode produces a program that is 114,688 bytes. The original
minimal version isn't far behind at 114,176 bytes. But once I set the linker option
to not include any standard libraries, we've already slashed the byte count down
to only 976, and 640 bytes respectively. And to confirm, yes, these programs still run as normal. Now the next step is to ditch the C code
and meet up with a good friend of mine, x86 assembly. And we'll also
switch out from using the Microsoft assembler, to the Netwide Assembler
so that we can write out the entire PE file, headers and all, and assemble it all at once to be
extremely accurate with what our output will be. The message box version now sits at
864 bytes, while the minimal version has briefly bounced back to 848 bytes.
That may seem like we’re headed in the wrong direction, but now it's time to really
start cutting the fat by getting rid of some of the optional and unneeded parts of the file.
Goodbye Rich header, goodbye DOS stub, goodbye debug table, and the ASCII art of a shrub.
Message box version now comes in at 416 bytes, minimal version down to 336 bytes.
And this is about as small as the program can legally be squeezed. And by that I mean that
beyond this point there is no guarantee that this program will work with future versions of Windows
because it's time to break the file header. Most of the new savings come from smashing the
PE header into the MS-DOS legacy header. We can't get rid of it completely as Windows still
checks for the MZ at the beginning of the file, but then it just jumps to location 3C to grab
the offset of where the PE header starts. And it just so happens that if the PE header starts at
location 04, then 3C will line up with the section alignment value of the header, which can also be
4. In addition, many of the other values in these headers aren't even used, so I can use them to
store my strings for the message box, which brings me down to an even 400 bytes. And for the minimal
version, I've gotten it as low as 268 bytes, but it's here that I've hit the proverbial bedrock.
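The header overlap described above comes down to a bit of offset arithmetic, which can be sketched in Python (not the assembly used for the actual files). This is only an illustration: it assumes the standard PE layout of a 4-byte "PE\0\0" signature, a 20-byte COFF file header, and SectionAlignment sitting 32 bytes into the optional header, and the function names are made up for this example. The stub it builds holds just the fields involved in the trick, not a runnable EXE.

```python
import struct

E_LFANEW_OFFSET = 0x3C         # where the DOS header stores the PE header offset
PE_SIGNATURE_SIZE = 4          # b"PE\0\0"
COFF_HEADER_SIZE = 20          # size of the COFF file header
SECTION_ALIGNMENT_OFFSET = 32  # SectionAlignment's offset inside the optional header


def section_alignment_file_offset(pe_offset: int) -> int:
    """File offset of SectionAlignment when the PE header starts at pe_offset."""
    return pe_offset + PE_SIGNATURE_SIZE + COFF_HEADER_SIZE + SECTION_ALIGNMENT_OFFSET


def build_overlapped_stub() -> bytes:
    """First 0x40 bytes of a file whose PE header is folded into the DOS header."""
    image = bytearray(0x40)
    image[0:2] = b"MZ"
    image[4:8] = b"PE\0\0"
    # SectionAlignment = 4 lands exactly on 0x3C, so it doubles as the PE offset.
    struct.pack_into("<I", image, section_alignment_file_offset(4), 4)
    return bytes(image)


def find_pe_header(image: bytes) -> int:
    """Follow the same first steps as the loader: MZ, then 0x3C, then PE signature."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, E_LFANEW_OFFSET)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    return pe_offset
```

With the PE header at offset 4, the sum is 4 + 4 + 20 + 32 = 60, which is 0x3C, so a single 4 serves as both the header offset and the section alignment.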
There are actually still a bunch of junk 0's at the end of this file, theoretically it could come
down to 210 bytes with what I have now, because normally you can just delete the trailing 0s since
the file will be loaded into memory on a page full of 0s. But no matter what I've done, it just
won't load the file if it’s less than 268 bytes. It seems that several sources confirm that 268
bytes is the smallest possible executable size that will run on 64-bit Windows. Why? Well, that might be a better
question for Dave. However, for other versions of Windows,
executables can get as small as 97 bytes! But again, most of these programs
don't really do much, although in the 64-bit Windows version
since it’s stuck at 268 bytes, some people have actually squeezed in a
couple of DLL calls to create a message box, or open the windows calculator, so there
is some room to get things done, but is it enough space to fit a whole game into here? No. Well, at least not a good game. I’m not sure typing simulator would become a best
seller on Steam. But how small could you squeeze a game? First of all, I won't be
able to include any standard libraries, this game will have to rely 100% on whatever the
Windows API provides. So it's time we meet another good friend of mine, the kernel, Kernel32. Kernel32.dll is one of the core DLLs that provides functions for interacting with the Windows
Operating System. It provides everything from memory management and process management
to file system access and certain I/O. And it really shouldn’t be too
hard to build a game using these basic Kernel provided functions because that’s literally what
every game, and every program, does at its core. To start the game, I’m first faced with two main
options for the program’s subsystem, that is, the type of Windows environment in which the
program will be executed. Either a console, or window. Going with the latter would allow
for more graphical control, but I'd have to explicitly create the window myself. Whereas
if the subsystem is set to console, then a console is automatically provided for me if the
program isn’t run through the terminal already. But wait! Consoles aren't for games. Consoles
are only for slow asynchronous I/O. Well yes, that can be true, but it doesn't have to be. In
fact, your precious console which pretends to act like a fancy text editor is mostly a well-oiled
illusion, any keyboard input that is echoed as character output is done so quite on purpose.
Before getting into I/O, we'll first need to get the 'handle' of our console, which is a token
that represents a system controlled resource. Our console has both an input and output handle
that we can get by using the Kernel's GetStdHandle function. To use this in our assembly code,
we can create an external reference to it, then push the parameters of the function to the
stack, then we'll CALL that function. Using the Microsoft Macro Assembler, we could also use
the INVOKE keyword to pass on the parameters like this, which looks more like calling
a function in a high-level language, but it's just a shortcut for what really happens.
If we pass -10 to the GetStdHandle function it’ll return the STD_INPUT_HANDLE of the console,
and -11 will give us the STD_OUTPUT_HANDLE. We'll keep both of these in a variable since
we'll be doing both input and output later. Output is much easier to achieve as we can
use the kernel's WriteConsoleA function to pass our output handle, pointer to the buffer
containing our output ASCII string, the number of characters we'll write in our string, a pointer
to the DWORD that returns the actual number of characters written, and then one more zero. Doing
that, we can already write output to the console, and here's the first trick up my sleeve: Color.
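For anyone who wants to poke at the same two Kernel32 calls without an assembler, here is a rough sketch using Python's ctypes. It is not the assembly from the game; the helper names are invented for this example, and the function simply returns 0 when it isn't running on Windows. The five arguments follow the order described above: output handle, buffer pointer, character count, pointer to the written count, and the final reserved zero.

```python
import ctypes
import sys

STD_INPUT_HANDLE = -10   # passed to GetStdHandle to ask for the console input handle
STD_OUTPUT_HANDLE = -11  # passed to GetStdHandle to ask for the console output handle


def as_dword(value: int) -> int:
    """32-bit pattern that actually gets passed for a negative constant."""
    return value & 0xFFFFFFFF


def write_console(text: str) -> int:
    """Write ASCII text with WriteConsoleA and return the characters written."""
    if sys.platform != "win32":
        return 0  # the Win32 console API only exists on Windows
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [wintypes.DWORD]
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.WriteConsoleA.argtypes = [
        wintypes.HANDLE,                 # console output handle
        ctypes.c_char_p,                 # pointer to the ASCII buffer
        wintypes.DWORD,                  # number of characters to write
        ctypes.POINTER(wintypes.DWORD),  # receives the number actually written
        wintypes.LPVOID,                 # reserved, the "one more zero"
    ]
    kernel32.WriteConsoleA.restype = wintypes.BOOL

    handle = kernel32.GetStdHandle(as_dword(STD_OUTPUT_HANDLE))
    data = text.encode("ascii")
    written = wintypes.DWORD(0)
    kernel32.WriteConsoleA(handle, data, len(data), ctypes.byref(written), None)
    return written.value
```

If the output has been redirected to a file, WriteConsoleA is expected to fail and the count stays at 0, so this is only meant for a real console window.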
Now, you could use a couple of other functions from Kernel32 to change the current color of the
console's characters, but back in 2016 Microsoft reintroduced support in the console for the old
ANSI escape sequences, which are codes made up mostly of standard ASCII characters that can
control terminal formatting. Everything from setting the cursor location, text color, console
size, console title, and a whole lot more.
However, to ensure it'll work properly, I do
have to use the SetConsoleMode function to ensure that virtual terminal processing is
enabled. But after that, I can be quite sure that ANSI escape codes like this will be able
to modify the console's output formatting just by calling the same WriteConsole function, which
will save a lot of program space to not have to use functions like SetConsoleCursorInfo, which I
would have called to make the cursor invisible, but this ANSI string has the same result.
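To make the escape codes a little more concrete, here is a small Python sketch that builds a few of them as plain strings. The helper names are my own for this example and the exact strings used in the game aren't shown here, so treat this as an illustration of the standard sequences rather than the game's data. The flag value is the virtual terminal processing bit that gets OR-ed into the console's output mode before calling SetConsoleMode.

```python
ESC = "\x1b"

# Output-mode bit that turns on escape sequence handling in the Windows console.
ENABLE_VIRTUAL_TERMINAL_PROCESSING = 0x0004

HIDE_CURSOR = ESC + "[?25l"  # same visible effect as hiding the cursor via the API
SHOW_CURSOR = ESC + "[?25h"
RESET = ESC + "[0m"          # back to the default colors


def with_virtual_terminal(mode: int) -> int:
    """Return the console mode with virtual terminal processing switched on."""
    return mode | ENABLE_VIRTUAL_TERMINAL_PROCESSING


def move_cursor(row: int, col: int) -> str:
    """Cursor position sequence; rows and columns are 1-based."""
    return f"{ESC}[{row};{col}H"


def sgr(*codes: int) -> str:
    """Select Graphic Rendition: 30-37 set the text color, 40-47 the background."""
    return ESC + "[" + ";".join(str(code) for code in codes) + "m"
```

For example, `sgr(34)` switches the text to blue and `sgr(33, 41)` gives dark yellow text on a red background, all by writing ordinary bytes through the same output call.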
As for the game world, I'll take 48 bytes of data and for each 0 bit in those bytes
output a sky character, and for each 1 bit a meteor. The sky is the Full Block extended-ASCII
character, with text color set to blue, and the meteor is the lower half block in dark
yellow, with the background set to red. Then after every four bytes I write the newline character
to the console which creates this 32x12 grid. For the protagonist I wasn't sure
which ASCII character to go with, but I narrowed it down to these three and
decided to give the house a try. I can use the SetConsoleCursorPosition function to set the
Y of the cursor to the bottom row of the world, and the X position to the player's X value from 0
to 31, then call the same WriteConsole function to output the player's character. There is a bit of
flicker here as the background and player are both reprinted sequentially at around 60fps. To fix
that I could stick these into an output buffer before writing to the screen, but to minimize the
amount of code in the final program, I won't.
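The 48-byte world described above can be sketched in a few lines of Python. This is an illustration rather than the game's assembly: it assumes the most significant bit of each byte is drawn first (the bit order isn't stated), and the color strings are one plausible choice of escape codes for blue sky and a dark yellow meteor on red.

```python
WORLD_WIDTH = 32                    # cells per row, one bit each
WORLD_HEIGHT = 12                   # rows in the world
BYTES_PER_ROW = WORLD_WIDTH // 8    # four bytes per row

SKY = "\x1b[34m\u2588"              # blue text, full block
METEOR = "\x1b[33;41m\u2584"        # dark yellow lower half block on red


def render_world(world: bytes, sky: str = SKY, meteor: str = METEOR) -> str:
    """Turn the 48 world bytes into 12 lines of 32 cells, newline after each row."""
    if len(world) != WORLD_HEIGHT * BYTES_PER_ROW:
        raise ValueError("world must be exactly 48 bytes")
    lines = []
    for start in range(0, len(world), BYTES_PER_ROW):
        cells = []
        for byte in world[start:start + BYTES_PER_ROW]:
            # Assumed bit order: bit 7 is the leftmost cell of each byte.
            for bit in range(7, -1, -1):
                cells.append(meteor if (byte >> bit) & 1 else sky)
        lines.append("".join(cells))
    return "\n".join(lines) + "\n"
```

Passing plain placeholders such as `render_world(world, ".", "#")` makes it easy to eyeball the grid without any color codes in the way.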
To get user input from the console, I can
use the ReadConsoleInputA Kernel32 function and that returns an INPUT_RECORD, which is a
structure that contains different event types, such as when the window is brought into focus,
cursor input, or the only thing I’m interested in: KEY_EVENTs which will then attach a
KEY_EVENT_RECORD struct, that contains what I'm looking for,
the ASCII character of the keyboard input. Now, let me test if this code
is working correctly here, it should echo whatever the user
types back out onto the console. OK, and that works twice as well as I expected.
So it looks like the issue is that there is a KEY_EVENT for both pressing a key and releasing
a key, so I also have to check this part of the KEY_EVENT_RECORD that tells us whether the key
is being held down or released. And... that's more like it now. Although it is a little
sketchy since the non-ASCII keys are also outputting character data, here's the output of
the character chart from 0 to 255 for reference, but that shouldn’t be anything to worry about
since I'm not actually going to be outputting anything the user types anyways, I'm just
checking whether the user presses the left or right arrow keys and then moving the
player along the bottom row accordingly.
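The press-versus-release filtering can be simulated in Python without touching the console at all. This sketch is a stand-in, not the game's code: the KeyEvent class is a made-up, cut-down version of KEY_EVENT_RECORD with only two fields, and it assumes the arrow keys are matched by their virtual-key codes (0x25 for left, 0x27 for right).

```python
from dataclasses import dataclass
from typing import Iterable

VK_LEFT = 0x25    # virtual-key code for the left arrow
VK_RIGHT = 0x27   # virtual-key code for the right arrow
MAX_X = 31        # the player's X runs from 0 to 31


@dataclass
class KeyEvent:
    """Hypothetical slice of a key event: pressed-or-released flag and key code."""
    key_down: bool
    virtual_key: int


def move_player(x: int, events: Iterable[KeyEvent]) -> int:
    """Apply key-down events only, so a press and its release count once."""
    for event in events:
        if not event.key_down:
            continue  # ignore the release half of each key press
        if event.virtual_key == VK_LEFT:
            x = max(0, x - 1)
        elif event.virtual_key == VK_RIGHT:
            x = min(MAX_X, x + 1)
    return x
```

Without the `key_down` check every tap would move the player twice, which is the same doubling that showed up in the echo test.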
Then every few frames the meteors will move down a
layer, and a new layer of meteors will be randomly generated. And without a standard library to
just make random numbers magically appear, there are a few options I can go with to implement
that same magical functionality by using the Windows API, though none of them
are found in the Kernel32 library; they live in the advanced API services library (Advapi32.dll). One method is to use the Microsoft Cryptography
API, but that requires quite a few functions to set up, and I'm trying to keep it minimal
so I'll go with the RtlGenRandom function, also known as SystemFunction036. That's literally
what I have to import it as in the program. But all I need to do is specify the destination buffer
and how many random bytes I want, and then I can take 4 random bytes AND them with four others to
try to save some space between the meteors, then add those to the top layer.
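The AND trick for thinning out the meteors can be sketched in Python. This is only an illustration: `os.urandom` stands in for RtlGenRandom, the function names are invented, and it assumes the world is stored top row first so that scrolling means dropping the last four bytes.

```python
import os
from typing import Callable

ROW_BYTES = 4  # one 32-cell row of the world


def new_meteor_row(random_bytes: Callable[[int], bytes] = os.urandom) -> bytes:
    """AND two random 4-byte draws so fewer bits end up set."""
    first = random_bytes(ROW_BYTES)
    second = random_bytes(ROW_BYTES)
    return bytes(a & b for a, b in zip(first, second))


def scroll_world(world: bytes, new_row: bytes) -> bytes:
    """Move every row down one layer and put new_row on top."""
    return new_row + world[:-len(new_row)]
```

Each bit of a single random draw is set about half the time, so after ANDing two independent draws a cell holds a meteor roughly one time in four, which leaves gaps for the player to slip through.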
Once the meteors reach the bottom layer, I can take these last four bytes, throw them into
a single 32-bit register, then roll the bits to the left 32 minus the character's X position
times to load the object at the location where the player is standing into the carry flag,
again 0 for the sky, or 1 for a meteor. If it's a 1 then the player has been squished
by a meteor and it's, as the kids say: game over. I've added a points counter to the
top using four packed BCD bytes, that's two digits zero to nine per byte,
and I can make great use of the x86 Decimal Adjust AL After Addition (DAA)
instruction to correctly perform BCD addition, big thumbs up for CISC, right guys? This score gets output to the screen every
frame, followed by the world and the player. After drawing the frame, checking
the input, and running all the game logic, I can use the Sleep function to suspend the current
thread for a specified number of milliseconds. I have it set to 16ms, for a target of 60fps,
but of course that doesn't account for program execution time, so it will be slower
than that. I could use the GetLocalTime function instead to see if 16ms has passed since the last
frame was drawn, but that would cost a few more precious bytes of data, so again, I won’t. Once the player is crushed the program will
use my favorite Kernel32 function: Beep. And I just have to read this description
from the Microsoft documentation here: A long time ago, all PC computers
shared a common 8254 programmable interval timer chip for the
generation of primitive sounds. But then everything changed
when the sound cards attacked. I may have added that last part there,
but this is still a cool function, you can just drop in the frequency that you
want and duration of the sound, and then [beep] You could do something really cool with this function
like make the world's smallest digital piano. [Playing Tetris Theme Song] Granted it’s not a good piano,
but it is the smallest [beep beep beep beep beep beep beep beep] Fun side quest, but back to the game at hand,
after assembling with MASM the program weighs in at 2448 bytes, and after really trying
to cut the fat without breaking the headers, because I do want this to be playable on
future systems, I've got it down to 2358 bytes. And the last piece of the puzzle is
another good friend of mine: the Crinkler. Crinkler is a compressing linker for Windows;
it takes the place of the standard Windows linker, turning our object file into a compressed
version of an EXE which then decompresses when you run it. It's not cheating, because I make
up the rules here. So, after crinkling my game, it manages to sneak into the
under 1K club at 825 bytes. Here it is after I turned it into a
QR code hanging on my fridge. And by the way, the world's smallest
piano weighed in at 1584 bytes before crinkling, and 555 bytes afterwards. So the only question I have now is:
what is the largest possible EXE? Theoretically, a PE32 file could be
no larger than 4GB since that's the 32-bit addressing limit, but for jump instructions
it would have to use a signed 32-bit value, so the limit should be around 2GB.
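The two theoretical limits are just powers of two, which a few lines of Python can spell out. This is plain arithmetic and nothing Windows-specific; the names are made up for this example, and "GB" here is taken to mean the binary unit of 2^30 bytes.

```python
BYTES_PER_GIB = 2 ** 30

UNSIGNED_32_BIT_SPAN = 2 ** 32        # everything a 32-bit address can reach
SIGNED_32_BIT_MAX = 2 ** 31 - 1       # largest positive signed 32-bit value


def to_gib(byte_count: int) -> float:
    """Convert a byte count to binary gigabytes."""
    return byte_count / BYTES_PER_GIB
```

`to_gib(UNSIGNED_32_BIT_SPAN)` comes out to exactly 4.0, while the signed maximum sits one byte short of 2.0, which is where the "around 2GB" figure comes from.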
PE32+, the 64-bit version of the file, does have a special flag to say that it
can handle addresses larger than 2GB. However, we're once again
limited by Windows itself. It doesn't seem to be able to handle
executables, both PE32 and PE32+, that are greater than around 1.85 gigabytes.
Of course, if you've opened Chrome in the last few years, you'll know that programs can take
well over 2, 4, or even 8 gigs of memory, basically they'll just take whatever you throw at
them, but I'm only talking about Windows loading a large executable, I don't care how big it gets
once it's open. And for the record, Chrome.exe is only 2.66MB, Steam.exe is 4.18MB, Excel.exe is
66MB, and even Photoshop.exe is only 160MB. As for the exact maximum number of bytes that
Windows can take, I simply took one of the last versions of the program we were shrinking before,
and threw between 1.8 and 2GB worth of 0s on the end and did a binary search to find
which would load and run, and which wouldn't. 1,996,488,704 bytes. Completing the entire range of runnable
executables on 64-bit Windows. Thanks once again to Brilliant, check
them out down below, and until next time…