Hardware Basics

Captions
The hardware of a computer system can usefully be described as composed of three basic parts. First, you have the CPU, the central processing unit; second, the system memory; and third, what's called I/O, short for input/output, which comprises everything else: your hard drive, your display monitor, your keyboard, your mouse, everything other than the CPU and the memory. All of these components, of course, need to get hooked together, so we plug them into what's called a mainboard (sometimes called the motherboard). The mainboard provides power to the components and communication pathways between them.

This three-part description (CPU, memory, and I/O) really does apply to any computer, though we sometimes make distinctions between different classes of systems. For example, what's called a client, or sometimes a workstation, is a single-user system, like a desktop or laptop. What's called a server often has no direct human interface devices like a keyboard or monitor, because servers mainly communicate with other systems over the network. There is truthfully no hard and fast distinction between client and server systems: you could stick your laptop in the back room and use it as your server. However, clients and servers typically have different performance needs; whereas a desktop might have a fancy graphics processor to play games, a server might have a large amount of memory to better store a large database.

Handheld systems, like smartphones, and embedded systems, like your car's computer, are also composed of a CPU, memory, and I/O devices. Again, the distinction is mainly in the form factor (the size and shape of the casing) and in the performance needs: your toaster doesn't need much processing power, and your smartphone needs power efficiency much more than your desktop PC does.

The term mainframe today is at best nebulous and at worst totally obsolete. The large majority of mainframes sold today are sold by IBM, and typically these systems are about the size of a large cabinet.
Unlike most servers in use today, which have a standard PC architecture, mainframes have custom architectures optimized to handle a lot of network traffic and data processing. Mainframes have always been used only by larger organizations, but even large organizations have increasingly moved away from mainframes in the last two decades, because they can get the same needs met using an array of cheap, commodity server PCs.

The term supercomputer is also not as meaningful as it once was. It simply refers to a computer doing a whole lot of computation, most commonly for the purpose of scientific simulations, like, say, modeling weather patterns. A few decades ago, supercomputers were always very specialized hardware, but in the last two decades the trend has been to build supercomputers out of many off-the-shelf processors put together into one system. Now, getting a bunch of processors to work together efficiently in one system is not a trivial matter, but it's considerably easier than it once was. In fact, you can build a passable supercomputer today by networking a bunch of commodity PCs together to form what's called a cluster.

Of all the components in the system, the RAM, the system memory, is the simplest. RAM is really just a big bucket for storing bits. As far as the CPU can see, these bits are organized into bytes, each with its own address, a numeric value that uniquely identifies that byte. The first byte has address 0, the second byte has address 1, the third has address 2, and so on, all the way up to the last byte. So if the CPU wants to read or write a byte in RAM, it specifies the byte by its numeric address.

A notable characteristic of RAM is that it's volatile, meaning that as soon as a RAM chip loses its steady stream of power, the content of the RAM gets scrambled: none of the bits can be relied upon anymore, because without power they get flipped unpredictably. What this means in practice is that when you turn off your computer, the contents of RAM are lost nearly instantly.
The next time the computer is turned on, the contents of memory start off in a total jumble, as a bunch of random garbage. This explains why the operating system must be loaded from disk every time you power on the system. Despite this annoying characteristic of volatility, we use RAM chips for system memory instead of, say, a hard drive, because RAM chips are much, much faster to read and write. As code runs, we want the CPU to access the instructions as fast as possible, so usually we copy all of a program's code into memory off of slower storage mediums, like hard drives, before running the program. RAM can also be used to store any data created by a program as it runs, but of course anything which the program wants to store permanently must be copied to a non-volatile storage device, like a hard drive.

If we wanted to fully understand how CPUs work, we would have to get into a lot about circuitry and electricity, as well as material science to understand how CPUs are manufactured. As programmers, though, we don't really need to know how CPUs work as long as we understand what they do. What programmers care about in regard to a CPU is its so-called programming model, the abstraction presented to programmers that elides the messy details of circuitry and voltages and so forth.

First of all, what a CPU does, from the programmer's perspective, is execute binary instructions, which are sequences of bits, typically around 8 bits in size on the low end and around 256 bits on the high end, with most somewhere in between. The way to think of these instructions is that the CPU is hardwired to read one instruction after another, and hardwired to act upon each kind of instruction differently. For example, if the binary sequence 1 0 1 1 0 0 1 1 denotes the start of an addition instruction, the CPU is hardwired to perform an addition operation when it reads an instruction starting with that sequence.
Again, how exactly this works in circuitry is not something we'll concern ourselves with here. The binary sequence which denotes any particular instruction is largely arbitrary, and so different CPUs understand different sets of instructions. For example, the binary sequence denoting an addition instruction on one CPU may not be a valid instruction on another CPU. In fact, one CPU may have instructions which another CPU does not have at all; for instance, some simpler CPUs have no instruction for performing multiplication, so code on those processors must perform multiple additions to get the same effect as a multiplication operation.

Still, every CPU needs instructions for a few essential tasks. First, every CPU needs one or more instructions for copying bytes from one location to another, mainly from one part of memory to some other. Second, every CPU needs instructions for doing basic arithmetic, at the very least addition and negation. Third, every CPU needs instructions for performing logical operations, namely the operations NOT, AND, OR, and exclusive OR (XOR), which we'll explain in a later unit; the gist of it is that with these instructions we can arbitrarily manipulate the individual bits of bytes, such as setting a specific bit to 0 or 1. Fourth, every CPU needs instructions for performing jumps. As we've described, the CPU is hardwired to read instructions, and it does so one after another, starting from some place in memory and reading the instructions there sequentially. A jump instruction orders the CPU to execute instructions at an address specified in the instruction, effectively jumping from one place in code to another. Crucially, CPUs need jump instructions which are conditional, meaning that they perform the jump only if a certain condition is true. For example, a conditional jump instruction might perform the jump only if all the bits in a byte of memory are 0. What conditional jumps effectively enable programs to do is make decisions: to branch down one path of code or another based upon the state of data.
A register is a small, volatile data storage area inside the CPU. The CPU's registers can be categorized into two kinds: status registers and general-purpose registers. A status register stores data that affects the operation of the CPU. For example, some CPUs operate in different modes, and so such a CPU will typically have a status register in which the bits designate the current mode. For another example, every CPU needs to keep track of the memory address of the next instruction to read, and it does so in a status register called the program counter. In fact, what a jump instruction really does is modify the program counter, thereby causing execution to jump to the new address found there.

In contrast to a status register, a general-purpose register is for storing any data. Again, though, these registers are typically very small, around 16 to 128 bits in size, and even high-end CPUs contain only up to a few hundred general-purpose registers, while low-end CPUs may contain only three or four. Today's x86 processors contain a few dozen registers, which is on the low end for high-performance processors.

So if the CPU registers can't store much data, why do we have registers? Well, most CPUs can only perform operations on data in the registers, not in memory. Addition instructions, for instance, generally can only add numbers in the registers, not numbers out in system memory. So to add two numbers in memory, we must first copy them to separate registers; the result of the addition operation gets written to a third register, from which we may then copy the result out to memory if we wish. This, then, is the general pattern of code: to do work, we bring data from memory into the registers, perform an operation, then copy the result of the operation out to memory as needed. In fact, most CPUs don't even allow us to directly copy bytes from one part of memory to another; instead, we must copy the bytes from memory into registers, then copy from those registers to the desired destination in memory.
So a CPU's programming model primarily consists of its instruction set, the precise set of instructions which the CPU is hardwired to understand, and its set of registers: their number, their sizes, and their purposes. Together, these two facets of the CPU are often called its ISA, its instruction set architecture. If you have two processors which both support the same ISA, then they both can run the same code, even if the processors were made by different manufacturers. For example, the PC platform is built around Intel's x86 instruction set architecture, and for a long time, up until about the mid-1990s, the only x86 processors you could buy were made by Intel itself, but then AMD came along and started producing x86 processors as well. So x86 is one very dominant ISA today, but there's also ARM, which is very popular today in mobile devices. Some other successful ISAs include MIPS and Motorola 68K.

An instruction set architecture tends to develop and grow over time. The x86 ISA, for instance, started in the 1970s but has evolved since as Intel and AMD have released new processors: the newer processors support the instructions of the older ones, and they have all the same registers, but they also support additional instructions and include additional registers. So x86 is more accurately an umbrella term that covers a family of backwards-compatible ISAs. To illustrate, when Intel released the 386 processor, they designated that upgrade of the x86 ISA as IA-32, as in Intel Architecture 32, and when Intel released a 64-bit processor in the mid-2000s, the new, upgraded ISA became commonly known as x86-64. But again, these upgrades preserve the old instructions and registers, so if you get a new Intel 64-bit processor today, it'll still run code written for IA-32.

You're probably familiar with Jonathan Swift's book Gulliver's Travels, or at least you've probably seen the Disney cartoon.
A part of the book not depicted in the cartoon is that, in the land of Lilliput, the Big-Endians are at war with the Little-Endians over whether to crack eggs from the big end or from the little end, the joke being that the choice is totally arbitrary. CPU designers have a similarly arbitrary choice to make concerning how the bytes in a register get copied to and from memory. Say we have a 32-bit register with the bytes, in hex, 0A 0B 0C 0D. As a binary number, 0A is the most significant byte, the byte representing the most significant digits. The question is: when we copy the register contents to some address n in memory, do we copy the most significant byte 0A to n, 0B to n+1, 0C to n+2, and 0D to n+3? Or do we copy in the opposite order, copying the least significant byte 0D to n, 0C to n+1, 0B to n+2, and 0A to n+3? A CPU that starts with the most significant byte uses the big-endian scheme, and a CPU that starts with the least significant byte uses the little-endian scheme. In both cases, the order is maintained when copying from memory to a register: for example, in the big-endian scheme, copying the 4 bytes at address n to a 32-bit register copies the byte at address n to the most significant byte of the register. So if we copy a register's contents to address n and then copy from address n back to the register, the register contents remain unchanged.

Now, many sources you might read insist that the choice between big-endian and little-endian is completely arbitrary, that they make equal sense, but don't listen to those sources. The arbitrariness holds if we imagine memory as a vertical array of bytes, because who's to say whether the digits of a number written vertically should go up or down, and who's to say whether the addresses should increase numerically up or down? But if we imagine memory as a horizontal array of bytes, the addresses must increase left to right, unless we wish to go against all Western convention, and likewise all Western convention tells us to write numbers with the most significant digits on the left.
So if we think of memory horizontally, it makes no sense whatsoever to use little-endian. Unfortunately, for historical reasons relating to performance, Intel and some other CPU makers chose to use a little-endian scheme. Those reasons no longer make sense with modern hardware, but the x86 architecture is still stuck with little-endian byte ordering.

The various storage locations in a computer can be pictured as forming a hierarchy. As we go up the hierarchy, the speed goes up, but so does the cost per byte, and so we need slower storage for larger storage. For example, if an entire program could fit in the processor registers, we wouldn't need system memory, but even the smallest programs won't fit in just the registers. And of course, both the CPU registers and system memory are volatile, so we need hard drives for non-volatile storage.

Even though RAM is much faster than hard drives and other non-volatile storage, RAM is still relatively slow compared to the operations of the CPU. Therefore, another level of storage is used in between the processor's registers and RAM, called the CPU cache. This cache uses a form of memory called SRAM, short for static RAM, as opposed to the DRAM, dynamic RAM, used for system memory. Unlike DRAM, SRAM doesn't require a refresh cycle to keep its content, allowing SRAM to be read and written significantly faster. Moreover, the processor cache is often placed on the CPU die itself, allowing for faster access by the CPU. On the other hand, SRAM is significantly more expensive per byte, and so the typical x86 CPUs of 2013 have around 8 to 16 megabytes of cache, much smaller than the 4 or 8 gigabytes of RAM typically found in the same systems.

The unique thing about processor caches is that programmers typically have little to no direct control over them; instead, the content of the cache is managed by the hardware, transparently to the programmer. When your code reads an address of system memory, the CPU first checks if an up-to-date copy of that byte currently sits in the cache.
If so, the CPU can read the byte directly from the cache without reading from slower system memory. If an up-to-date copy of the byte does not already sit in the cache, the byte is copied from memory to the cache before it is read, such that it might still be there the next time the CPU wants to read that address. Because the cache is much smaller than system memory, only copies of small parts of system memory can fit in the cache at any moment. Consequently, when data from a memory address is copied to the cache, a copy of some other memory address must get overwritten; it's up to the hardware to automatically decide what to overwrite.

Now, the usual pattern with code is that when we read data at an address of memory, we're very likely to read nearby addresses as well. For example, when reading 100 bytes at address n, we start with byte n, then read byte n+1, then n+2, then n+3, and so on. Because this pattern of locality is so common, most caches are designed to over-eagerly copy bytes from memory, meaning that when the CPU reads an address from memory, the cache system will copy not just the data at that address to the cache but also the data of the surrounding bytes, say 1,000 of them, or 2,000, or maybe more. This way, memory access is optimized, because it's generally faster to get a chunk of bytes from memory in one read rather than to get each byte one at a time. Of course, if the running code doesn't need the other bytes anytime soon, the extra work is a waste, but the strategy does usually pay off.

On systems with this cache behavior, a very effective optimization strategy is to maximize locality: to keep the bytes of your code and data next to each other as much as possible. If the bytes of memory needed by a processing-heavy part of your code are scattered far away from each other, they're less likely to all be in the cache at once, meaning that the CPU is more likely to have to wait for reads of memory as it does the work.
For two electronic devices to communicate, there must be some common storage area which they can both read and write. When the CPU and an input/output device communicate, they do so by both reading and writing registers in the device. The relationship, though, is one-way: the CPU is in control, reading and writing registers of the device as it pleases, but the device cannot read and write the registers of the CPU. So when a device wishes to send a message to the CPU, it writes to its own registers with the expectation that the CPU will read the data at some point. In many cases, input/output devices communicate indirectly with the CPU through a controller device; USB devices, for example, talk directly to a chip called the USB controller on your mainboard, which in turn talks directly to the CPU.

What are the CPU instructions for reading and writing device registers? Well, in CPUs which use what's called memory-mapped I/O, some memory addresses specify device registers instead of bytes of system memory. Imagine, for example, a system in which the memory addresses from 0 to FFFF are mapped to device registers instead of bytes of RAM, such that the first byte of RAM actually starts at address 10000. (Notice also that a system very well may have more memory address space than it has bytes of RAM, and so some memory addresses may not map to anything.) Anyway, with memory-mapped I/O, we can read and write the device registers using the very same copy instructions used to read and write bytes of system memory: in this example, any copy instruction reading or writing an address in the range 0 to FFFF reads or writes some device register rather than some byte of RAM.

Other systems, however, use port-mapped I/O, in which a separate address space of so-called ports is used for device registers. In this arrangement, reading and writing the device registers requires distinct input and output instructions. For example, an instruction for writing to a device register might look something like "output register 2 to port 4498".
Whether a system uses port-mapped or memory-mapped I/O, or even a combination of the two, as x86 systems do, the next question is: how do programs know which addresses and ports are mapped to what? On some systems, these mappings are all hardwired and thus documented by the hardware makers, but on other systems, including PCs, many mappings are dynamically configured at system startup, such that certain devices at fixed ports and addresses are used at system startup to discover the ports and addresses of the remaining devices. Either way, most programmers don't really have to worry about device ports and addresses, because, as we'll discuss later, direct communication with devices is handled by the operating system.

As an I/O device operates, it may periodically want attention from code running on the CPU. The simplest strategy to provide this attention is called polling, in which the code on the CPU periodically checks the device registers to see if the device wants attention. The obvious problem with polling is that these periodic checks waste the CPU's time when the device needs no attention. Far better if the device could directly notify the CPU when it needs attention, which is the idea behind interrupts. An interrupt line is a circuit path running from the device to the CPU, over which the device can signal to the CPU that it wants attention. When receiving the signal, the CPU temporarily sets aside what it's doing to run the interrupt handler, a piece of operating system code associated with that interrupt line. The operating system stores a list of addresses for these handlers in the interrupt table, and the CPU keeps the location of this table in a status register. When an interrupt signal is received, the CPU is hardwired to copy the current program counter to memory so that it can later pick up where it left off; the CPU then looks in the interrupt table for the handler address associated with the interrupt line, e.g., line zero corresponds to the first handler address.
The CPU then jumps execution to this address, and the handler does its business to service the device. When finished, the handler is supposed to restore the CPU to its state before the interrupt. For example, if a handler uses a general-purpose register, it first copies the content to memory, then copies that content back when finished, so that the CPU can pick up where it left off.

In the course of execution, the CPU may sometimes detect an aberrant condition that needs attention from the operating system. These aberrant conditions are called hardware exceptions, and they're handled much like interrupts: when the CPU detects an exception, it jumps execution to a piece of operating system code called an exception handler by finding the address for the handler in the hardware exception table. For example, division by zero is an illegal operation, and so when a program performs a division instruction with zero as the denominator, the CPU may trigger a hardware exception, jumping execution to the handler for that type of exception. By this mechanism, the exception handler of the operating system can decide what to do about the situation.

When a system powers up, the first code that runs is the boot firmware. The firmware code usually resides in a small I/O device, a memory chip which retains its content thanks to a small battery on the mainboard. This device resides at a fixed address or port, and when the CPU powers on, it is hardwired to start executing code at that address. Older PCs had a boot firmware device called the BIOS, short for Basic Input/Output System, but more recent PCs have replaced the BIOS with the newer standard UEFI, the Unified Extensible Firmware Interface. In either case, the main task of the BIOS or UEFI is to jump execution to an operating system loader on one of the system drives, most commonly a hard drive. From there, the operating system code is responsible for managing the system and providing an environment in which to run other programs.
We'll end our discussion of hardware by summarizing the key qualities of a CPU which concern programmers. First off, to program a CPU, the coder must know the ISA: the set of instructions and registers.

The byte size of a system refers to how many bits make up each byte of addressable memory. In practice, this is not a real concern, because all modern systems use 8-bit bytes, but back in the 1970s and earlier, some machines used sizes other than eight, such as six or seven. The industry settled upon eight bits per byte most likely because eight is a convenient power of two and not too large.

The term word is used to denote the number of bits which a CPU can handle most naturally and efficiently. The word size generally corresponds to the size of the general-purpose registers; e.g., a processor with 32-bit registers generally works most efficiently when copying data in 32-bit chunks. Some processors, though, actually have registers of varied size and so may have no single proper word size.

The address size refers to the number of bits used for each address, effectively determining the size of the memory address space. A processor that uses 32-bit addresses, for example, can address 2 to the 32nd, which is over 4 billion, unique addresses. Newer x86 processors typically use 48-bit addresses, which allows for over 65,000 times as many addresses, orders of magnitude more than the actual number of bytes in memory in today's systems. In fact, we may never need the full 48-bit address space: even if we can produce memory chips with much larger capacities, it's not clear we would ever have the need for that much memory.

Also of possible concern with a CPU are the speeds and sizes of its caches, which a programmer may wish to keep in mind when attempting to heavily optimize code. Programmers also need to be mindful of how a system treats byte order when copying data between memory and the CPU registers; getting this wrong can produce junk data and serious bugs. When doing low-level code that interfaces directly with hardware, such as when writing device drivers or an operating system, a programmer would need to know how the devices are mapped, whether to memory addresses or to ports.
For most programming, though, such as when writing applications, programmers can let the operating system and its drivers handle this concern.

Lastly, a CPU core is the part of the CPU which performs the core work of executing instructions. Each core executes instructions one by one, but if our system has more than one core, it can execute multiple instructions simultaneously, meaning that two programs can run simultaneously on two separate cores. Most PCs before 2005 had only one core, but some PCs, particularly those aimed at the server market, had two or more processors, each with its own core. In 2005, Intel introduced the first multi-core x86 processor, a processor with multiple cores contained in the same ceramic package. The advantage of stuffing multiple cores into one single package, rather than having multiple packages, is that it's relatively cheaper and better for power management and system cooling. Most new consumer PCs today contain one processor with two or more cores, while some expensive systems, namely servers, still use multiple separate processors; those processors now typically have two or more cores each. While it is possible for a single program to accelerate its performance by utilizing more than one core, doing so effectively can be very tricky, as we'll discuss later when we talk about concurrency.
Info
Channel: Brian Will
Views: 70,837
Rating: 4.9482203 out of 5
Keywords: computer hardware, programming
Id: 9-KUm9YpPm0
Length: 25min 34sec (1534 seconds)
Published: Thu Oct 10 2013