While working at Control Data Corporation, computer designer Seymour Cray was once
asked for his five-year and one-year goals. He said:

> Five-year goal: Build the biggest computer in the world.
>
> One-year goal: One-fifth of the above.

Seymour Cray literally only wanted
one thing. To take it to the limit. In today's video, we look back at a genius’s
lifelong quest to make the biggest supercomputers.

## Beginnings

Seymour Cray Jr. was born in the
small town of Chippewa Falls, Wisconsin in 1925, the son of a civil
engineer at the local power utility. The elder Cray fostered in his son a love for
science and engineering. The son adored chemistry, did electrician work for his junior
prom, and showed a talent for radio work. After serving in the army as an electrician, Cray went to university on the GI Bill
and graduated from the University of Minnesota with a bachelor's in electrical
engineering and a master's in applied math. Then in 1950, he joined a small company called Engineering Research Associates
or ERA in St. Paul, Minnesota.

## ERA

During World War II, the US Navy
employed a team of elite code-breakers. This division was known as the
Communications Supplementary Activity - Washington or CSAW. And
in secret, they designed and built powerful machines to break Axis naval
codes, mostly to find German submarines. At the War’s end, people realized that
the US Government still needed to break codes - Soviet codes this time - but with budget
cuts it could not afford to keep its employees. So the Navy worked with an investment
banker named John Parker to set up a for-profit company that would hire and employ
the old team at higher salaries. This was ERA. ERA’s people worked largely in secret out of an old glider factory. Their first products
were code-breaking machines like before. But the team quickly began moving towards
useful general-purpose computers - leveraging emerging technologies like magnetic drum memory. One of the first such products
was the ERA 1101. The computer was a commercialized version of a device
sold to one of the precursors of the NSA. You can say that Minnesota in the 50s was
Silicon Valley before Silicon Valley existed. Thanks to the strong military presence,
it was America's premier computing center.

## Working With Cray

Despite being just 25 years old, Cray
showed himself to be a talented worker. Cray was quiet, but had the unshakable
confidence to speak up when something was wrong. He had incredible focus and rigid
discipline. And of course, he was brilliant, with a gift for understanding
the world of binary numbers. Thus, Cray was challenged to produce the
control system for ERA's next computer - the ERA 1103. The control system breaks down the
software's instructions into execution steps. Control system design required front-to-back
knowledge of the computer's guts because it had to coordinate all those resources
to efficiently execute the program. Cray did the work, and quickly rose
up the ranks to supervise a team. Like another legend of the times, An Wang of Wang
Labs, Seymour Cray's genius could sometimes make him difficult to work for. Cray preferred to work
alone, spending late evenings in the workshop. If someone wasn't doing it right, or was taking
too long, he reassigned them and did it himself. They called it "being scrayed". But despite being very quiet, Seymour treated
his colleagues well and was generally liked and respected. Just a bit eccentric and enigmatic -
exactly as a genius computer designer should be.

## Rand

In December 1951, president and
original investor John Parker sold ERA to the computer company Remington Rand. Parker didn't tell the employees at ERA about this
- which in bird culture is considered a dick move. And it angered key workers like William Norris, who joined early and rose
to become VP of Operations. Remington Rand was a typewriter and shaver
company. Why computers? Even more strangely, Rand had already bought a
computer company earlier: The Eckert–Mauchly Computer Corporation. J. Presper Eckert and John Mauchly
will forever be known for making ENIAC, widely regarded as the first programmable, general-purpose electronic digital computer. And then EDVAC, pioneering the
stored program computer design that today underpins the Von Neumann architecture. Along the way, Eckert also created the first
course on electronic digital computers - the Moore School Lectures, which helped popularize the Von
Neumann architecture. No relation to Moore's Law. And then UNIVAC I, a computer that
garnered renown for predicting that some dude named Dwight Eisenhower would
win the 1952 US presidential election. The world-famous Eckert and Mauchly had
serious computer credentials so the guys at ERA accepted the arrangement.
Eckert–Mauchly would focus on the business computer marketplace while
ERA focused on scientific computing.

## Tensions

But tensions remained. Today, the computing needs of both
business and science can be handled by a general purpose computer. But in those
days, the computers had to be more specific. Computers for business did simple
operations - adding, subtracting, and multiplying - with only 2 decimal
places of accuracy. However, this had to be done at large scale, with thousands of
rows going in and out of the computer. Scientific computing was different in
that it required complex calculations with up to 20-30 decimal places
of accuracy. Such a computer might munch on a problem for hours just
to produce a single line of output. The guys at Eckert–Mauchly in
Philadelphia looked down on the Minnesota guys as mere "farmers" who did
not work on the state-of-the-art. And the Minnesota guys saw the Philadelphians
as theoreticians who only cared about making computers faster even if
it meant them breaking down a lot. Things got worse in 1955 when Rand merged with
the Sperry Corporation to become Sperry Rand. The merger forced together the two formerly
independent units into one Univac division. The tensions really flared between ERA,
Eckert, and their new corporate overlords. The aforementioned William Norris,
the division's new general manager, said to the Philadelphia guys, "You people
run a laboratory and ERA runs a business". When Sperry took over, its top management
including president Harry Vickers thought they were buying a market leader. That was
not the case. UNIVAC had the potential to be what IBM eventually became, but
it needed the capital to get there. Not only in R&D to build the machines, but
also because computers were an equipment leasing business back then.
It was very capital intensive. In 1957, enough was enough. Norris left
Sperry Rand to found a new company - Control Data Corporation, or CDC. Starting a new company
was a huge risk, but Norris figured that if it did not work then he and his family
could go back to their farm in Nebraska. Norris invited a few of his ERA coworkers to come
with him. Cray was one of those who accepted, having seen the writing on the
wall when he noticed an accounting system categorize his project as
"999 Miscellaneous and Other".

## Control Data

CDC started in July 1957 with its employees, a vague idea to make computers,
and some money from friends. No plant, no product, and little money.
To get the company off the ground, CDC IPO'ed. Like literally, they stood
on the street and sold shares to ordinary members of the public for a dollar each.
This is probably not possible today. The company got its first splash of publicity
from an unexpected spot. Sid Hartman, a sports columnist at the Minneapolis Tribune
- and former part-time general manager of the Minneapolis Lakers (what?!) - mentioned the
move at the end of one of his sports columns. Many prominent local investors
declined an investment in CDC, one saying that they "didn't have
a ghost of a chance" against IBM. Decliners included a guy named Warren Buffett.
Despite being William Norris's nephew by marriage, he declined due to a lack of understanding.
Bet he's poor now and deeply regrets that. Years later, the $1 shares would be worth
many times that. The early CDC IPO enriched the company's 300 initial investors and created a
generation of new wealth in the Minnesota area.

## The 1604

Cray was the most technical person at the
new company, and persuaded his cofounders to build scientific computers rather
than going to the commercial market. His reasoning was that their clientele -
universities and nuclear weapons research labs - cared less about marketing and client
service. And they programmed their own software. What they wanted was compute. Serious
compute. Due to treaty obligations, you can't just test-fire a nuclear weapon. And even setting aside the environmental issues, you
cannot easily measure its workings. So to study a detonation, we needed
computers to simulate the bomb's chain reactions - stepping through all
the equations every micro-second after detonation. That means the biggest
and fastest computers on the market. Cray was convinced that he could build such
a computer at a relatively bearable cost using transistors. He went down to the local
electronics shop and found that they were selling reject bipolar transistors for radios for
cheaper than what you can get from the factory. These reject transistors sucked, outputting a weak signal. So Cray paired them up in
what is called a Darlington pair, with the second transistor amplifying the output
of the first. The experience taught him that, with the right design, you can use substandard
components and still achieve the goal. Over a year into the venture and with
money running low, Bill Norris struck a deal with the US Navy in 1958 for what would
be called the CDC 1604 computer. But now they had to build it at scale,
buying a factory and hiring more engineers. Norris and other managers cut their salaries
in half to save money. Engineers resorted to swiping transistor companies' free
sales samples for their computer. The CDC 1604 first hit the market in 1960, carrying a price tag of $990,000 or about
$10.4 million today. At 0.2 megahertz, it was the most powerful commercially-available
computer of the time. A supercomputer.

## Supercomputers

A supercomputer is a bit of a squishy term. It is about pushing the envelope in computing, bringing out a computer that leads
all others in its field. Control Data was not alone in the market of producing
super-fast computers for niche customers. UNIVAC released the Livermore Advanced
Research Computer, or LARC, in 1960 - the same year as the 1604's release. It helped
Edward Teller do simulations for the hydrogen bomb and was the most powerful
computer in the world from 1960 to 1961. The LARC scared IBM so much
that they built the IBM 7030 STRETCH supercomputer - designed by the
legendary Gene Amdahl. The 7030 took back the crown of the world's most powerful
supercomputer, and retained it until 1964. But the LARC and STRETCH
were basically made-to-order products. UNIVAC only made 2 LARC units. Control Data turned the supercomputer into
a category - a commercially successful one, at that. The 1604 sold to the University of
Illinois, Lockheed, the State of Israel, and more. The company began to turn a profit, challenging
the old computer giants like Rand and IBM. CDC stock went from $1 to $9. Now Norris had to keep
the engineers from selling their stock too early.

## A New Hope

After the 1604, Cray and CDC
debated about how to proceed: Follow up on the 1604 and finally attack the
lucrative business data processing market? This would mean iterating on the 1604 architecture
- making smaller computers like the CDC-160A, a very good and successful
control applications computer. But Cray only wanted to build
the fastest possible machine. The scientific community had only started
to realize a new form of computational modeling - Finite Element Analysis,
which I mentioned in a prior video. Finite Element Analysis involves breaking
something down into millions of simpler elements and running simulations based on
how those elements might act. For instance, breaking a car down into tiny shapes and using
that to predict how it might survive a car crash. With Finite Element Analysis, the more elements you
can break something down into, the better you can model and predict complex systems like the weather
or nuclear explosions. This basically implied an infinite need for compute. Cray wanted to
be the guy to feed that need for speed. He threatened to leave the company over this
issue, which caused Norris to eventually agree to split the work between two teams. One CDC
team would work on a 1604 follow-up. Meanwhile, the 35-year-old Cray and his team
were allowed to open their own lab in Cray's hometown of Chippewa Falls, Wisconsin. This new
lab was just a brief stroll from his house. There Cray and his team worked on a machine some 15
times faster than the 1604, named the 6600.

## The CDC 6600

When Cray and his team sat
down to make the CDC 6600, they started off with something like the 1604. The 1604 was built with ferrite magnetic cores for main memory and magnetic tape for
secondary storage. For compute, they had germanium transistors. Everything was
built inside air-cooled pluggable building blocks. But as they worked on the 6600, Cray
changed many things. Critically, he sourced silicon planar transistors
from Fairchild. They switched far faster than germanium transistors,
automatically granting a 5x speed boost. The rest of the 10-15x speed up goal though
had to come from somewhere else. Cray soured on the building-block approach. Each block had
extensive back panel wiring, which not only caused noise issues, but also limited input/output
and increased how long it took to transmit data. So Cray threw it all out and switched to
using denser, more complex custom modules called "cordwood modules". The shorter wires
improved speed, but also made it difficult to repair and necessitated the replacement
of air-cooling with freon gas cooling. Another concept they implemented
was parallelism. Every system has to do housekeeping functions and
the like in addition to the main compute task. Why should the "main processor" have
to do that? Offload it to something else. The 6600 contained 11 individual computers
that could execute programs separately from each other - they only shared a central memory.
Ten of the computers handled secondary work like peripherals, leaving the eleventh computer
free to do nothing but high speed math. Additionally, the 6600 lived and breathed
simplicity. A computer CPU uses something called an instruction set architecture
or ISA to define its basic operations, thus also defining how software can control it. The 6600 simplified its ISA, ditching everything
unrelated to scientific computing. Like for instance, instructions for handling large amounts
of data, something more geared for commercial users. This simplified instruction set let
the central processor keep several operations in flight at once, breaking a bigger job down into smaller ones that its independent
functional units could work on simultaneously. CDC delivered its first 6600
to Lawrence Livermore in 1964. The machine's incredible speed - three
times faster than the 7030 - shocked IBM. Chairman Thomas J. Watson Jr.
wrote a scathing memo asking how "34 people - including the janitor" beat the
biggest technology company in the world. The answer of course was that IBM's architects could
not bear to sacrifice compatibility for speed.

## Discontent

The splitting of the teams within
Control Data prevented a blowup, but discontent continued to fester. Norris and other managers continued to build
up the business. CDC began making its own peripherals and software to accompany the
main computer, building a services business. Control Data also purchased a consumer finance
company - Commercial Credit - intending to use their $3.4 billion of working capital to fund its
computer leasing strategy. This strategy - which seemed smart at the time - eventually backfired
when Commercial Credit ran into difficulties. Over a hundred CDC 6600s were sold
to big customers like the Atomic Energy Commission. But each cost $8 million or
about $23 million today. As you might think, it limited the market to about
50 total customers in the world. But Seymour Cray felt this was a feature,
not a bug. He loved knowing the first names of each of his customers. Yet Control
Data's management was increasingly coming to the belief that peripherals and services,
not hardware, were the company's future. Control Data followed up the CDC 6600 with
the 7600. The 7600 was hailed as the world's fastest computer, five times faster than the
6600. But despite costing only twice as much as its predecessor, it sold poorly in part
due to frequent breakdowns and a weak economy. Then after that, we had the 8600. This
supercomputer was made with regular discrete transistors. But Cray wanted
a clock cycle time of 8 nanoseconds, which meant every wire had to be shorter than 2.5
meters, squeezing those parts very close together. Things got so dense that Cray couldn’t figure out
how to sufficiently cool them. After many months, he decided to throw everything out and start again
from scratch. It was his style - the "Cray way". But this was 1971, and Control Data was
in the midst of an expensive antitrust lawsuit against IBM. Cash flow was running
low. Cray was asked to cut expenses by 10%. Unwilling to do that, he cut his own
salary to minimum wage, or $1.25 an hour. This did not solve the issue. In the
end, Norris told Cray that a redo like before could not be done - they already
pre-sold two 8600 systems. And in 1972, Cray decided to leave and start
his own shop - Cray Research.

## Cray Research & the Cray-1

Seymour Cray founded Cray Research with $2.5 million - 20% of which was his own
money - and a bunch of bank loans. The company's goal was to build the biggest
computer, one at a time like a master artisan. It did not care for big revenues, nor did
it expect them. The focus was on research rather than manufacture. Like Star Trek,
plumbing the outer limits of possibility. In a show of goodwill, Norris and Control
Data arranged a luncheon to say goodbye and invested a quarter million dollars in the
new company. Norris called it "heart money". For his first computer, the Cray-1, Seymour Cray
wanted revolutionary performance. To get it, he decided to turn to a new
concept: Vector processing. Most CPUs of the time used scalar
processing, meaning they process single data items like integers or
floating point numbers one at a time. So imagine the job of adding 1 and 1. A
scalar CPU would load the first 1 into its register from memory, load the second 1, add
them, and then store the result into memory. Count it up. This job used 4 instructions. So if we are summing up 2 sets of twenty numbers, that is 80 instructions that
a scalar CPU has to handle. A vector processing machine
shortcuts that by processing single-dimension arrays of data: Vectors. So if we have those two vectors
containing 20 numbers each, the machine loads the two vectors into
its registers, adds them, and stores the result vector into memory.
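
To make the count concrete, here is a minimal sketch in Python. It assumes the simplified model from this example: four instructions for every scalar addition, and four for every vector operation. The function names and the 64-element register length are illustrative assumptions, not a description of any real instruction set.

```python
# Simplified instruction-count model from the narration (an illustration,
# not a real instruction set). All names here are hypothetical.
SCALAR_STEPS = ("load a", "load b", "add", "store result")
VECTOR_STEPS = ("load vector a", "load vector b", "vector add", "store vector")


def scalar_instruction_count(n: int) -> int:
    """Each of the n additions repeats all four scalar steps."""
    return n * len(SCALAR_STEPS)


def vector_instruction_count(n: int, register_length: int = 64) -> int:
    """Four steps per register-sized chunk of the two vectors.

    register_length is an assumed capacity of one vector register.
    """
    chunks = -(-n // register_length)  # ceiling division
    return chunks * len(VECTOR_STEPS)


if __name__ == "__main__":
    n = 20  # two sets of twenty numbers, as in the example
    print("scalar instructions:", scalar_instruction_count(n))
    print("vector instructions:", vector_instruction_count(n))
```

Vectors longer than one register would have to be processed in several chunks, which is why the sketch takes a `register_length` parameter.
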
That only uses 4 instructions rather than 80. Control Data knew about vector processing
too. They had a small team working on a vector computer called the STAR-100. But the
STAR was tremendously complicated and failed to live up to its promises. Control Data shipped
it four years late and sold only three of them. Cray studied the STAR-100 and realized its
flaws. First, its scalar processing was slow, bottlenecking the overall system performance. And second, the computer's vector processing
implementation had a hitch. Recall my example from before. Before you can run the addition
operation on the two vectors, you first have to load them both into the register. Same
with sending the results back to the memory. The problem was that this was taking too long.
Does it matter how much faster vector processing is compared to scalar processing if
handling the vectors took forever? So Cray introduced "vector registers", very fast intermediate memory systems that
worked like cache memory to improve speed. Seymour also decided to adopt integrated circuits
for the first time. This allowed for more density and cut down on wiring, allowing the Cray-1
to be far smaller than its predecessors. By then, ICs were roughly 14 years old and
quite mature, but it reflects Cray's approach of choosing older technologies - "a decade behind" as
he liked to say - so that they were more reliable. But when it came to memory, Cray could
not compromise. He broke his principle and bought bipolar semiconductor memory chips
to replace the old core memory. It cost less, had more density, and ate less power. The computer's clock cycle of 12.5 nanoseconds
made it five times faster than the CDC 7600. So every wire in the machine had
to be less than four feet long. And of course, you can't forget its iconic look. A circular shape to accommodate
the new cooling scheme. But with an added bit of flair to differentiate
from the boring gray boxes of the era. And it had cushions too.

## The Cray-1's Stir

The Cray-1 made a huge stir upon its release in
1976 with its flashy look and world-beating speed. A hundred Cray-1s were sold to various
government and university lab customers like the National Center for Atmospheric
Research and the Department of Defense. It generated 150% revenue growth
for Cray Research from 1978 to 1979, with another 50% growth a year after that.
The orders came in so fast - one a month, which is a lot for an $8 million
product - that a big backlog developed. IBM did not even try to compete.
Cray's former employer, Control Data, found itself thrown off its feet. They
tried - producing vector computers like the Cyber-205 - but they had gotten
bloated and complicated, unable to keep up. The company suffered large financial losses and eventually sold itself off
in pieces. By the mid-1980s, CDC's most profitable business was Ticketron
- a rival to the widely despised Ticketmaster.

## The Fruits of Success

As I said, Seymour Cray wanted
to build the fastest computer, and to build it from a "clean piece of paper". Even as the Cray-1 was in the
late stages of development, Seymour started to shift his gaze towards
a machine even more ambitious: The Cray-2, with a clock speed some three to six
times faster than its predecessor. Such a machine had obstacles. With that clock
speed, no wire could be longer than 40 centimeters, again bringing back the same heating
challenges Cray faced with the 8600. To his dismay, Seymour could not focus
on solving these problems because his business needed him. To fund early development, Cray Research IPO'ed its stock, which brought
on a whole new load of responsibilities. And a bit poetically, the company's success
caused new headaches. Since each Cray-1 was hand-wired and custom-made like some
limited edition supercar - a process that took a year - the company had no choice
but to staff up to deliver on its big backlog. From 1978 to 1980, the company grew from
300 to 500 employees. At its peak, Cray Research employed over 5,000 people in
the tiny town of Chippewa Falls, Wisconsin.

## The Same Dilemma

Seymour originally pursued scientific
computing because users wrote their own software. It let him just focus on hardware. But times had changed. Customers no longer
had the budget to rewrite their software every time. They wanted portability - it is why
the Unix operating system got to be so popular. So, increasingly, Cray's customers were
more interested in getting a better Cray-1 than a radically different Cray-2
which would require them to redo all their software. It was the story of
Control Data and the 1604 over again! Eventually, Cray Research's management,
including CEO John Rollwagen, took the dual approach once more. On one side,
they extended the Cray-1 line with the 1S - still a very powerful computer but not
radically different like a Cray-2 might be. Meanwhile, Seymour Cray stepped down
as Chairman in 1981, handing that job over to the CEO Rollwagen, and became an
independent contractor so that he could work on the Cray-2. He moved to Boulder,
Colorado to work in peace once more.

## The Cray X-MP

The Cray-2 eventually did come out in 1985,
after three false starts on the cooling system. Famously, it had this massive liquid immersion
cooling system that caused the machine to resemble an aquarium. Even so, memory latency caused
the system to underperform its full potential. Then to the surprise of many, Seymour's
Cray-2 found itself upstaged by another computer produced by a separate
team: The Cray X-MP supercomputer. The X-MP team was led by a
long-time Cray collaborator named Les Davis as well as a talented young
Taiwanese-American designer named Steve Chen. Where the Cray-1 had a single CPU, the
X-MP introduced parallel processing with four CPUs along with new solid state
storage semiconductors. Released in 1983, the X-MP was the world's fastest
supercomputer - 2-5 times faster than the 1S - without the radical design
changes of the Cray-2. People were stunned. It sold very well compared to the Cray-2. By
1989, there were only 24 units of the Cray-2 sold as compared to the almost 200 units sold of
the X-MP and its immediate successor the Y-MP.

## New Pressures

There were other changes. In the 1970s, Cray had
no competitors for its unique part of the market. But throughout the 1980s, new supercomputer
competitors started to emerge out of the woodwork. First over in Japan, where
Fujitsu, Hitachi and NEC leveraged Japan's growing advantages in VLSI semiconductor
production to make compelling supercomputers. Cray still dominated the market. In
1988, they had 56% market share of traditional supercomputers, but the Japanese
altogether had 37% and were gaining ground. In a similar vein, supercomputer
startups like Thinking Machines and nCUBE began exploring new approaches
to supercomputing beyond just vector computing. The most prominent of these
were Massively Parallel Processors, or MPPs. These systems coordinate many commercial
microprocessors to do millions or even billions of floating point operations
each second. These microprocessors being bought off the shelf meant
far better price for performance. This combined competition from the startups, the Japanese, and even old friends like
Control Data's supercomputer spinoff ETA Systems put a lot of pressure on Cray to
focus its approach and its product lineup.

## Cray Leaves Cray

Cray Research's unexpected success with
the X-MP would come back to haunt it. Designer Steve Chen was featured as one of the
company's young exciting rising talents - the next Seymour Cray, even. But unfortunately, the
aggressive vision that Chen and his team had for the MP line after the Y-MP spiraled beyond
what the company could financially support. 64 processors, custom integrated circuits,
and maybe even optical interconnects. In 1987, the company was already invested
in developing Seymour Cray’s next machine the Cray-3 and the Y-MP
in addition to three existing products that needed money too. It had no
money for Chen’s science fiction dream. So that year, Cray Research scaled back
the MP line of computers. In response, the much heralded Steve Chen quit Cray Research and
started his own company - Supercomputer Systems. Supercomputer Systems took $150
million of investment money from IBM and other investors like Ford and
Boeing, but went bankrupt in 1993. In 1989, Cray Research could no longer accommodate
its founder either. Seymour Cray joined a spinoff called Cray Computer Corporation or CCC in
Colorado to work on the future Cray-3. Cray Research would thus go onwards with the existing
Cray X-MP architecture, building the ecosystem. The Cray-3 would have used gallium arsenide
semiconductors for switching performance far faster than what was possible with silicon. It
necessitated buying a lot of chipmaking equipment. The Cray-3 soon fell behind, and in 1991,
various customers started cancelling their orders because of cratering defense
demand after the fall of the Soviet Union. CCC could only sell one system before filing
for bankruptcy in 1995. An announced Cray-4 never materialized either. Seymour Cray's
cherished approach of "sitting down with a clean sheet of paper" and building a "big iron"
supercomputer was no longer financially viable. Seymour Cray formed a new company, SRC
Computers, to begin exploring parallel designs. But he then passed away in 1996 at the
age of 71 due to injuries from a car accident. Cray Research was eventually sold to Silicon
Graphics. It bounced around for a while, but is now part of Hewlett Packard
Enterprise as just Cray Inc.

## Conclusion

Advancing semiconductor technologies have made the
supercomputers of the past seem comically behind. In 2010, an electrical engineer
named Chris Fenton did a project to emulate the Cray-1A supercomputer using
a Xilinx Spartan-3E 1600 development board. He even put it into a cute little Cray-1 package. It now sits in the Computer History
Museum, and it inspired this video. Today’s leading edge semiconductor makers
now face the same issues as Cray did with his supercomputers. Thermal problems.
Interconnect problems. Slowdowns due to memory retrieval. I am struck by the similarities. Unfortunately, the semiconductor industry
cannot do as Seymour did. Throw it all out, and start anew with a fresh sheet of paper.