Welcome to Whiteboard Programming, where we
simplify programming with easy-to-understand whiteboard videos. Today, I'll be sharing
with you the difference between a data mart, a data warehouse, a database, and a data lake. So, let's get started!

1. Database

Well, a database is a traditional
method of storing data in tables, columns, and rows. This allows for easy data queries and processing. Databases are typically controlled by
database management systems (DBMS for short), with relational database management
systems (RDBMS) being the most popular kind. Businesses typically use databases when
they need quick access to their data. For example, an airline might rely on a database
to process customers’ online ticket purchases. And an e-commerce company like Amazon might
use a database to track inventory levels and recommend products that the customer might
be interested in.

Further, to ensure that transactions
have integrity, databases need to have four properties:

1. Atomicity, which means that either the entire transaction takes place at once or it doesn’t happen at all.
2. Consistency, which means that integrity constraints must be maintained so that the database is consistent before and after the transaction.
3. Isolation, which ensures that multiple transactions can occur concurrently without leaving the database in an inconsistent state.
4. Durability, which ensures that once a transaction has completed, its updates to the database are written to disk and persist even if a system failure occurs. These updates become permanent and are stored in non-volatile memory, so the effects of the transaction are never lost.

Databases that have these four properties
are said to be ACID-compliant. Now that we have a good idea about how the
bulk of data has been stored in the internet age, let’s take a look at some newer storage
mechanisms, starting with the data warehouse.

2. What is a Data Warehouse?

So, a data warehouse is a central platform
for data storage that helps businesses collect and integrate data from various operational
sources. This data is put into reports, which are then
used for data analytics purposes and business intelligence efforts. In this light, data warehouses serve as the
backbone for mission-critical aspects of operations. Many of today’s leading corporations
across sectors, including the airline, hospitality, healthcare, and retail industries, are using
data warehouses to streamline their data intake, reduce waste, and make more efficient use of the
data they generate. In most cases, data warehouses store structured
data, typically from databases. To help you understand the concept better,
here are some additional benefits of data warehouses.

1. Data Integration: A data warehouse enables businesses to collect data from various external sources and then integrate that data into one central storage platform. This makes it easier for data analytics teams to analyze all the data, as there aren’t any silos.
2. Data History: As the name suggests, data warehouses can store data in a way that lets analysts see how data has changed over time. For example, Microsoft teams can determine who created a file, who modified it, and when.
3. Better Data Quality: A data warehouse enables an organization to improve the quality of its data by breaking down data silos. This enables organizations to unlock the full power of their structured data and gain valuable insights.
4. Better Data Insights: With more value on hand and less scattered data (as all silos have been eradicated), the analytics team can make more sense of its data infrastructure and gather better, deeper insights. With this information, the team can then figure out the best path forward to increase business intelligence and impact.

Up next, let's talk about data marts.

3. What is a Data Mart?

Well, a data mart is a mechanism through which
business users access data that lives in a data warehouse. Because the needs of every employee and each
team are different, data marts can be defined to showcase user-specific data points. So, you can say that data marts typically
help specific users or teams, not the entire workforce. While a data warehouse typically includes
an entire enterprise’s data, a data mart is more user-focused. For example, an accountant might access financial
information related to customer transactions from a data warehouse through a data mart,
whereas someone from marketing would be more interested in seeing which conversations
led that customer to buy from the company during the initial lead-nurturing phase.

Further, there are three different types of data
marts:

1. Independent Data Mart: As the name suggests, an independent data mart functions without relying on an existing data warehouse and typically focuses on one specific business objective rather than taking a broad approach. Here, the data is drawn from either internal or external sources and can be called upon when needed to perform data analysis and business intelligence.
2. Dependent Data Mart: Unlike the previous type, a dependent data mart lives on top of an existing data warehouse. In this arrangement, data lives in a centralized location, and when it’s time to run analytics, only the relevant data is accessed.
3. Hybrid Data Mart: A hybrid data mart integrates data from external operational sources with an existing data warehouse. The main benefits here include higher speed, flexibility, and the capacity to handle large storage structures.

Next, let's look at data lakes.

4. What is a Data Lake?

Well, a data lake is a data storage repository
that can store large quantities of both structured and unstructured data. A data lake functions much as its name
might suggest: all data, regardless of format, is stored as-is. For instance, imagine that each bit of your
business’s data is like a drop of water. These tiny drops of data flow freely from
various streams and rivers until they reach their final destination: your data lake. Taken together, this
data forms quite a large lake.

Now, a major benefit of data lakes is that
they can store data without any prior processing. Here, the data simply flows into the lake
and then stays there, awaiting future requests from analysts and business users, to be used
in other business functions. Further, this free-flowing process means that
more data can be collected, stored, and retrieved than ever before. What’s more, since data lakes themselves
impose no fixed structure, it’s much easier to access and modify the data that lies within them.

Here are some additional benefits that data
lakes deliver to modern enterprises.

1. Unlimited Data Sources: Thanks to their free-flowing nature, data lakes can handle data from a virtually unlimited number of sources.
2. Storage for Raw and Unstructured Data: Thanks to its flexible construction, a data lake can take in both structured and unstructured data, unlike most traditional data warehouses.
3. No More Data Silos: This is the most valuable one. Since data silos are removed from the equation, data lakes help organizations maximize the potential of all of their data, including unstructured data.
4. Lower Costs: Data lakes can save an organization a considerable amount of money by eliminating the need for outdated legacy methods of data storage. Data lakes are also much easier for analysts to use, which saves valuable working hours.

With that, I hope this video was helpful to
you and that you found it valuable. If you love my content, feel free to smash that like button, and if
you haven't already subscribed to my channel, please do, as it keeps me motivated and helps
me create more content like this for you.
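
As a small follow-up to the atomicity property from the database section, here is a minimal sketch in Python using the built-in sqlite3 module. The accounts table, the names alice and bob, and the transfer helper are all made up for illustration and do not come from the video. The idea is simply that a transfer is two updates, and if the second one fails, the first one is undone as well.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Hypothetical helper: move `amount` from src to dst in one transaction."""
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if an exception is raised inside it.
    with conn:
        # Credit first, so an overdraft is only detected on the second statement.
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    rows = conn.execute("SELECT name, balance FROM accounts ORDER BY name")
    return dict(rows)


# In-memory database with an integrity constraint: balances may not go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("alice", 100), ("bob", 50)],
    )

# A valid transfer: both updates are committed together.
transfer(conn, "alice", "bob", 30)

# An invalid transfer: bob is credited, then the debit violates the CHECK
# constraint, so the whole transaction (including the credit) is rolled back.
try:
    transfer(conn, "alice", "bob", 500)
except sqlite3.IntegrityError:
    pass

print(balances(conn))  # {'alice': 70, 'bob': 80}
```

After the failed transfer, bob's balance is still 80 rather than 580, which is the "all or nothing" behavior that atomicity describes. The CHECK constraint plays the role of an integrity constraint from the consistency property.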