Is Big Data Getting Too Big?

Video Statistics and Information

Captions

[MUSIC] "This episode is brought to you by Lynda.com" [MUSIC]

We all use technology, but do we understand how it works? Consider a grain of rice. It's not much of a bite, but imagine if it was a byte. Our computers deal in a language of 1s and 0s to store their information and compute their instructions. Eight of these 1 and 0 bits make a byte, and a thousand of those make a kilobyte, and a thousand of those make a megabyte. But today we commonly deal in a currency even larger, like gigabytes and terabytes, although we don’t even understand how astronomically huge they really are. But consider this: a terabyte of seconds is 32,000 years! We’re quickly moving far beyond these scales to even larger ones, like petabytes, just one of which could cover the entire island of Manhattan. Did you know that all the videos on YouTube are about 500 petabytes? Companies like Google, meanwhile, store up to 10 exabytes of data. We are saving so much, in fact, that it’s becoming a serious challenge to even deal with.

[MUSIC]

It’s estimated that as many books were printed in the first 50 years after the Gutenberg printing press as scribes had written in all the previous 1,200. Today, on the other hand, we double our store of information every 2 or 3 years. Consider that in 2007, all the data we had ever saved (and I mean everything) was estimated at 300 exabytes. By 2013 that number had grown to 1,200 exabytes. The total amount of data on Earth, since the dawn of civilization, quadrupled in just six years. This acceleration will no doubt continue. The radio telescopes that make up the Square Kilometer Array will generate an exabyte of astronomical data every four days. In several years we’ll probably live in a world that deals in zettabytes, one of which, on our rice scale, would fill the Pacific Ocean. This isn’t about Moore’s Law, the exponential growth of computing hardware power; it’s about data: BIG data.
While much of it will be useless, or at least hard to organize, this quantity of data will lead to changes in the quality of how we live and understand our world, probably in ways that we can’t imagine. Consider this cave painting. Its creation was pretty slow and it contains a limited amount of information. But consider this photograph. It was faster to make and contains much more detail. Once we could capture that horse’s motion, our observations became even more meaningful, but they also consisted of more data. At first, only 12 or 24 images. But as we increase our resolution, dividing that experience into smaller amounts of time and detail, extracting more information, suddenly we create a ton of data to deal with.

But maybe big data is in our DNA. Literally. Sequencing the first human genome, reading just about every letter, cost roughly $3 billion in 2001. Today, the same sequencing only costs about $1,000, cheap enough that it might soon be cheaper to sequence a genome than to store one on a hard drive, tape drive, or magnetic storage device.

Beyond the challenges of processing and analyzing all this data, which are huge, we have a more practical problem. Where are we gonna keep it all? 100 years from now, it’s estimated we’ll be storing 42 yottabytes of data every year. Using technology that companies like Google use today, we’d need enough data centers to cover the surface area of 12 Jupiters. But DNA itself might hold the answer. Harvard researchers have been able to write entire books to DNA, and the molecule has the potential to hold petabytes of data in just a few grams of genetic material. That doesn’t explain how we’ll read it, write it, or even store it, but to deal with the coming data deluge, we will need something new. It might give a whole new definition to… “saving the world.” Stay curious.
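
The scale figures quoted in the captions can be checked with a little arithmetic. The Python sketch below is not part of the video; the function names are made up for illustration, and it assumes decimal units (each step a factor of 1,000, as the captions describe), a 365.25-day year, and steady exponential growth between the two quoted data points.

```python
import math

# Decimal byte units as described in the captions: each step up is a
# factor of 1,000. (Assumption: binary 1,024-based units are ignored.)
UNITS = [
    "kilobyte", "megabyte", "gigabyte", "terabyte",
    "petabyte", "exabyte", "zettabyte", "yottabyte",
]

# Assumption: an average year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def unit_size(name: str) -> int:
    """Number of bytes in one unit of the given name (hypothetical helper)."""
    return 1000 ** (UNITS.index(name) + 1)


def seconds_to_years(seconds: float) -> float:
    """Convert a count of seconds into years."""
    return seconds / SECONDS_PER_YEAR


def doubling_time_years(start: float, end: float, elapsed_years: float) -> float:
    """Doubling time implied by growth from start to end over elapsed_years,
    assuming steady exponential growth."""
    return elapsed_years * math.log(2) / math.log(end / start)


# "A terabyte of seconds": one second for every byte in a terabyte.
terabyte_years = seconds_to_years(unit_size("terabyte"))

# 300 exabytes in 2007 growing to 1,200 exabytes in 2013.
doubling = doubling_time_years(300, 1200, 2013 - 2007)

print(f"A terabyte of seconds is about {terabyte_years:,.0f} years")
print(f"Implied doubling time: about {doubling:.1f} years")
```

Under these assumptions a terabyte of seconds comes to roughly 31,700 years, in line with the rounded 32,000 in the captions, and quadrupling in six years works out to a doubling time of about three years, consistent with "every 2 or 3 years."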

Info

Channel: It's Okay To Be Smart

Views: 574,331

Rating: 4.9514184 out of 5

Keywords: science, pbs digital studios, pbs, joe hanson, it's okay to be smart, its okay to be smart, it's ok to be smart, its ok to be smart, computers, big data, information, 80's, 1980's, gigabyte, terabyte, petabyte, exabyte, yottabyte, data center, vsauce, scishow, asapscience, smarter every day, veritasium, data, storage, data storage, byte, computer science, information science, itsokaytobesmart

Id: NTMkc0bLRlI

Length: 5min 43sec (343 seconds)

Published: Tue Feb 16 2016

Reddit Comments

So PBS is making Tim and Eric videos.

3 points, u/[deleted], Feb 16 2016

A E S T H E T I C

3 points, u/dakane328, Feb 16 2016