What is Snowflake? 8 Minute Demo | Snowflake Inc.

Video Statistics and Information

Captions
[Music] Thanks for taking a look at Snowflake. Snowflake helps organizations collect more data than ever before. Our Cloud Data Platform is capable of running any workload, enables secure and governed access to all data, and delivers the performance and scalability modern enterprises need, all in an easy-to-use platform delivered as a service.
This demo looks at Snowflake through the eyes of the analytics team at Citi Bike, the bike share system in NYC. Let's take a look at our data engineer, Sam, who needs to get the rider data loaded into Snowflake as quickly as possible. This data is key for the Citi Bike team to be able to understand what's influencing riders and how Citi Bike can serve them best. The data is currently in CSV format, and it includes useful information such as when, where, and how people are using bikes. We'll see how a data engineering and analytics team can both leverage Snowflake to drive meaningful insights. We'll look at how a data engineer can easily load data into Snowflake, then see how a rider engagement analyst can do quick analysis and even use semi-structured data for deeper insight.
The CSV is sitting in an Amazon S3 bucket. To load it into Snowflake, Sam will need a table to load it into and a compute resource, or warehouse in Snowflake parlance, to load it with. Since this file is just a couple million rows, Sam uses the smallest warehouse in Snowflake, which will cost a mere $1 for the hour it will be used, after which it will auto-suspend so Sam does not incur any additional cost for this workload. Now that the database and warehouse have been created, he can load the CSV data from the S3 bucket. Using Snowflake's built-in worksheets functionality, Sam uses standard ANSI SQL statements to create a table for the bike data and load it. Snowflake automatically handles all indexing and partitioning, saving Sam a lot of time and effort. Once the data is loaded, it's ready for use.
Let's see how Christina can use Snowflake for analysis. First, she logs into Snowflake and sees
the warehouse Sam is using. For performance, she wants to create her own compute instance and access the data independently of Sam. Snowflake uniquely separates compute from shared data, so multiple workloads can run in parallel without resource contention. Each compute instance only uses the amount of compute required for the workload. This means both Sam and Christina can select just the right amount of compute they need for their workloads, so Citi Bike pays only for what it uses.
Using the data source Sam just loaded, Christina hopes to learn how Citi Bike can get more riders, and to investigate any meaningful drops in ridership as well. Christina uses standard SQL statements to do some exploration and sees that the lowest days, the days with the fewest rides, are all clustered in February. She wants to see if the average duration of trips is also relevant, and sees that February 15th was among the days with the shortest trips on average. Clearly something was going on during this period of time. She wonders if weather was a factor and decides to ask Sam to collect some weather data so she can do some more analysis.
At Christina's request, Sam sets out to load some additional weather data to help with her analysis. Sam is able to find some JSON data that contains weather from around the world, which he thinks will help Christina. Even though this data is semi-structured, Snowflake is able to load it in its native format without any effort. Sam loads this data into Snowflake with the exact same method and SQL tools he used before. He decides to flatten some of the data into columns and store the nested data that remains in Snowflake's unique VARIANT column type, where he will be able to transform it down the road or use it natively if he has a need. Snowflake makes it easy to store either way for the utmost flexibility. Now that he has the table, it's time for Sam to load the data from the S3 stage into the table using the load warehouse he created before. Once this is done, Sam's job is over
and Christina can get on with analysis. Even though the load Sam started was not finished, Christina can already start doing some analysis without having to wait. She decides to look at the most popular days of the week, Wednesday and Thursday, to see if a lot of people use Citi Bikes on those days to commute. Christina discovers that a core group of customers was largely born in the 80s, so she can see that the main customer base is professionals in their late 20s or early 30s.
Next, Christina checks to see if the weather data Sam is loading has completed by checking the History tab, which shows everything that's going on in the database. She sees that the data was loading at the same time she was running her analytics. Snowflake's multi-cluster shared data architecture allows data to be loaded at the same time it's being analyzed, so Christina didn't have to wait like she would with other platforms. The weather dataset covers the entire world, so it is relatively large, and Christina filters down to just NYC. Since Snowflake simplifies the process of working with semi-structured data, she can transform the JSON data using standard SQL and break out the weather types so she can see if there was inclement weather on the days she's interested in. This view also filters down to New York.
To finish her analysis, Christina now needs to combine the weather data loaded from her JSON files with the trip data loaded from the CSV-formatted files earlier. After combining the data, she can see what the average trip duration was during different types of weather. It's clear that the snow had a role to play, as did fog, but what Christina really wants to know is if those days in February had bad weather. A closer look shows that it was a significant snowstorm that drove the lower ride numbers. Using one of Snowflake's many ecosystem partners, Christina could obviously do more robust visual analytics. However, we can see that Snowflake has allowed her to complete a thorough initial assessment. She was able to determine
that the core Citi Bike customer is in their 20s and 30s, that they ride Citi Bike to commute to work, and that the customer base is heavily influenced by the weather.
With analysis done, Christina quickly confirms that the compute resources are on auto-suspend, so the Citi Bike team isn't paying for resources they aren't actually using. Christina also sets her dedicated warehouse to auto-resume, so it will turn back on automatically the next time she needs to do analysis, saving her time and eliminating remedial database administration tasks.
In a few short minutes, we've shown how easy it is to load both structured and semi-structured data into Snowflake with the use of standard SQL. We also saw how it's possible to run simultaneous workloads on the same data without queuing or delays, enabling a team of data engineers and business analysts to collaborate to quickly discover new insights about their business. Visit trial.snowflake.com to get started with Snowflake for free today. [Music]
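The analysis described in the captions (flatten nested weather JSON, filter to New York, join it to trip rows, and average trip duration per weather type) can be sketched in plain Python. This is only an illustration, not the SQL used in the demo: the field names (`date`, `duration_min`, `city.name`, `weather[0].main`), the dates, and every data value below are made-up assumptions, and the functions `flatten_weather` and `avg_duration_by_weather` are hypothetical helpers.

```python
import json
from collections import defaultdict
from statistics import mean

# Hypothetical trip rows standing in for the CSV trip data (not real Citi Bike data).
TRIPS = [
    {"date": "2018-02-14", "duration_min": 14.0},
    {"date": "2018-02-15", "duration_min": 6.0},
    {"date": "2018-02-15", "duration_min": 8.0},
    {"date": "2018-02-16", "duration_min": 12.0},
]

# Hypothetical nested weather records standing in for the worldwide JSON weather data.
WEATHER_JSON = """
[
  {"date": "2018-02-14", "city": {"name": "New York"}, "weather": [{"main": "Clear"}]},
  {"date": "2018-02-15", "city": {"name": "New York"}, "weather": [{"main": "Snow"}]},
  {"date": "2018-02-15", "city": {"name": "London"},   "weather": [{"main": "Rain"}]},
  {"date": "2018-02-16", "city": {"name": "New York"}, "weather": [{"main": "Clear"}]}
]
"""


def flatten_weather(raw_json, city):
    """Flatten nested weather records into {date: weather_type} for one city."""
    by_date = {}
    for record in json.loads(raw_json):
        # Filter the worldwide data down to the single city of interest.
        if record["city"]["name"] != city:
            continue
        # Pull one nested value out into a flat "column".
        by_date[record["date"]] = record["weather"][0]["main"]
    return by_date


def avg_duration_by_weather(trips, weather_by_date):
    """Join trips to weather on date and average the trip duration per weather type."""
    durations = defaultdict(list)
    for trip in trips:
        weather = weather_by_date.get(trip["date"])
        if weather is not None:  # inner join: skip trips with no weather row
            durations[weather].append(trip["duration_min"])
    return {weather: mean(values) for weather, values in durations.items()}


result = avg_duration_by_weather(TRIPS, flatten_weather(WEATHER_JSON, "New York"))
print(result)
```

With these made-up rows, the snow day shows a shorter average trip than the clear days, which mirrors the kind of pattern Christina looks for in the demo.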
Info
Channel: Snowflake Inc.
Views: 328,595
Keywords: Snowflake, Snowflake data warehouse, Snowflake computing, Snowflake company, Snowflake database, Snowflake cloud, Snowflake IPO, Data warehouse, Business software, cloud computing, digital transformation analytics, digital transformation data science, data driven digital transformation, big data and cloud computing, internal vs external data, big data ecosystem architecture, data ecosystem, digitization of data, business digital transformation
Id: xojAXXRo_S0
Length: 7min 36sec (456 seconds)
Published: Fri May 08 2020