Best Data Science Books for Beginners 📚

Captions
A question I often get asked on the channel is which books to read when you start learning data science. So in today's video, I'll talk about some of the best books I know for data science beginners.

A data science process often involves a lot of activities, from data collection, data cleaning, and exploration, to data analysis, visualization, and storytelling, to building and productionalizing models. For this video, I mostly focus on books that help you develop the skills for these activities over here. This video is sponsored by DataCamp; more on them later.

In general, doing data science is kind of like building houses. You're going to need some tools, like a hammer and nails. These are similar to tools like Python, R, SQL, or Excel. But just knowing the tools is not good enough. You also need to know how to actually build and design a house using these tools. This includes the core knowledge, such as business knowledge, math, statistics, modeling, and design. Most of us might find ourselves being better at the tools than at the other things, or vice versa. I think a good book would ideally cover both theory and tools, but more often than not they are dedicated to only one or the other, and this is totally fine. You just need to be aware of that and know how to complement your learning with different books.

So for each book mentioned in this video, I'll rate it using three criteria. Firstly, depth: does the book cover the topic in depth? Second, readability: is it easy to read? And finally, practicality: can the knowledge from the book be easily applied in practice? Without further ado, let's dive right in.

The first book I'd start with is Python for Data Analysis. This book is mostly about pandas, NumPy, and matplotlib, and additionally scikit-learn and statsmodels. I have it on my iPad because I want to be able to copy-paste the code and check out the web links from here. You might already be familiar with these libraries. In my case, I learned a ton just from Googling and using snippets cobbled together from dozens of
Stack Overflow threads. However, with this book, it's nice to have the material in a clear, thoughtful, and organized way, so it can be a very good reference that you can use in conjunction with an actual Python course you're taking, and it can help you fill in some gaps you may find in the courses. There are 13 chapters in this book, covering basic Python setup, basic data structures and syntax, and data cleaning and transformation with pandas and NumPy. It also has dedicated chapters for plotting and visualization, time series, and modeling. This book has a lot of good examples for a lot of different types of data wrangling and analysis, and it explains many common gotchas for new Python users. The book doesn't cover more advanced topics such as parallelization or object-oriented programming. So overall, I'd give this book 4 stars for depth, 5 stars for readability, and 5 stars for practicality.

If you're learning R for data science instead, an equivalent introductory-level book I found is an online book called Short Introduction to Data Science by Ron Sarafian; link in the description. It is beginner-friendly but dangerous enough to teach you really useful stuff. Actually, it's the only R book I found that uses data.table, which is a library in R that improves the performance and simplifies a lot of the clunky base R data frame syntax. And honestly, I love R's data.table even more than pandas. I wrote a whole article a while back about it, but please don't hit me for this.

Talking about learning Python and R, DataCamp is running a week-long promotion to commemorate World Space Week. So between October 3rd and October 9th, learners can pay just one dollar for a month's worth of unlimited content on DataCamp's entire learning platform. You can take any courses you like on the platform, from Python and R to data analysis with Power BI. After a month, if you enjoy the courses and would like to continue the subscription, you pay full price per month going forward. Compared to Coursera, DataCamp doesn't provide certificates
but it lends itself to a more practical approach to teaching. You can code directly on the platform and follow along with the lessons, which is perfect for beginners who don't want to have to install the software themselves. You're basically taken by the hand to learn stuff. If you're interested, check out the link below. Thanks to DataCamp for sponsoring this video.

Moving on to statistics books, I think Practical Statistics for Data Scientists is the most useful and beginner-friendly one. It covers all the core concepts that you need to know for data science, including descriptive statistics, sampling distributions, hypothesis testing and A/B testing, and prediction and unsupervised learning. Actually, for one of my earlier videos about statistics for data analysis, I drew a lot of inspiration and the key content items from the first two chapters of this book. Each statistics concept is delivered very concisely in bite-sized sections, with the key ideas and key terms you need to know, in a very approachable way. Each concept also comes with code snippets for both R and Python, so it's quite handy to test out the code yourself and see how things work. This will help you quickly connect the theoretical concepts with practical use. If you want to read further on a topic, there's also a section for further reading. This is really one of the most practical and approachable books about statistics for data science I've found, although some topics could be discussed in more depth. So I'd rate this book 4 stars for depth, 4.5 for readability, and 5 for practicality.

A super casual and fun book for statistics 101 is Naked Statistics. It explains the fundamentals of statistics in a way that's very easy for a layman to understand. The author talks about key concepts such as probability, inference, correlation, and regression analysis, and how they are applied in all areas of our lives, like sports, politics, and business. At the same time, the book also points out many, many common mistakes we make, from
big to small, when it comes to statistics and modeling. One of my favorite chapters is "Don't buy the extended warranty on your $99 printer." I almost laughed to tears reading this chapter, because it reminds me of the first lesson I got about personal finance. So the story is: five or six years ago, when I got my first job and maybe the first paycheck, I went to get myself a new iPhone at the store for like five hundred dollars. At the checkout, the sales assistant asked me if I wanted to buy phone insurance in case of theft or if the phone breaks down somehow. I thought it made a lot of sense, buying insurance for my beautiful new phone. Then I went on to buy the one-year insurance plan that cost seven dollars a month. I came home that day, and after the shopping frenzy I sat down and realized, oh no, it was such a terribly stupid idea. Who on earth would pay around 80 dollars of insurance a year for a 500-dollar phone when the chance of having my phone stolen is one in thousands? Now, no wonder the sales guy was so sweet to me. So the lesson learned from the chapter is that you should only buy insurance for things where you can't afford the loss if something bad happens, like your health. But other things, like your $99 printer and other appliances, are the last things you want to buy insurance for. This is how the insurance industry uses probability to make money. A pity, I must say, is that this book doesn't cover Bayesian statistics, but it's overall a very good, entertaining book. So I'd rate it 4 stars for depth, 5 for readability, and 5 for practicality.

Regarding math books: math, I think, is mostly required for machine learning. For common data analysis, you generally need a very limited amount of calculus or linear algebra. For learning essential math concepts, I wouldn't recommend reading books as a starter. Most math-for-data-science books I found are pretty hard to read for beginners, and even downright boring. So I'd actually recommend first taking an online course to get the basics of calculus and linear algebra
under your belt. In combination with that, you could use interactive books such as Immersive Math and Explained Visually. I've linked to the websites in the video description below. I actually only use math books mostly to look up the formal definition of something. When you feel ready and comfortable with the fundamentals, there are some machine learning books that help you learn the math behind the algorithms pretty well.

For a truly gentle introduction to machine learning, I'd recommend Machine Learning Simplified. It's a new book that was published earlier this year that covers the fundamentals of machine learning. It's a beautiful book, and I must say I'm pretty impressed at how much ground this book covers and how accessibly the topics are presented. You can read the first few chapters for free on this website, or get the ebook for just two dollars on Amazon, which is totally insane. The second part, on unsupervised learning, is still coming, I think. So I think this is an excellent book whether you're new to data science or need a refresher, because it's very easy to understand and relies more on intuitive examples to explain the concepts instead of only showing the mathematical formulas. So far, I think this is one of the best introductory machine learning books I've found that gives you the exact portion of knowledge you need to know as a beginner, so you don't feel overwhelmed. The book also has a repository that contains the actual Python implementations of all the topics discussed in the book. It's a great book, very inexpensive; I love it. I'd give this book 4 stars for depth, 4.5 for readability, and 5 for practicality.

The next book on machine learning that's loved by many people is Hands-On Machine Learning with Scikit-Learn and TensorFlow. I've got the first edition, and the second edition looks a bit like this. This is, I think, the best book of its kind for data scientists and machine learning practitioners. The book has two parts: part 1 covers the fundamentals of machine learning, and part 2
is about neural networks and deep learning. The book first starts with an overview of different machine learning systems and the common challenges in machine learning, and then it goes on to data transformation, how to train models, and all the different techniques, terminology, and metrics you need to know. You can find all the most common supervised machine learning algorithms here. Part 2 covers building deep learning models with TensorFlow, and at the end we also have a small chapter on reinforcement learning using OpenAI Gym. Exactly as the name suggests, this is a very hands-on book. For each machine learning concept or algorithm that the book introduces, there's a page or half a page explaining what the concept is about in a nutshell. Then the book dives straight into the implementation of the algorithm with scikit-learn in Python. I really love the tone and the language used in the book. There's really no jargon, and even difficult math formulas are used very sparingly. The book lends itself to a more intuitive approach, with simple English and also data examples, pictures, and diagrams to help you gain a high-level understanding of the concepts. However, I feel like there should be more discussion comparing and contrasting different algorithms, example cases in which one algorithm might be preferred over others, or which algorithms are being used in practice by companies in industry. So I think I'll give this book 3.5 stars for depth, 4.5 for readability, and 5 for practicality.

There's often a big gap between developing machine learning models in Jupyter notebooks and actually deploying and scaling that model in production. This is also where a lot of companies and organizations struggle. So if you're interested in having an overview of end-to-end machine learning, or want to become a machine learning engineer, a great book for this is Designing Machine Learning Systems. It's a very new book, released just a few months ago at the time of making this video, written by
Chip Huyen, a machine learning researcher who happens to be Vietnamese. This book touches upon many important topics around productionalizing a machine learning system. A model often looks great on a toy data set in notebooks, but in various projects at work I have encountered many challenging issues. A very common problem is, for example, class imbalance, when you're predicting financial fraud and only 0.02 percent of clients in the data actually committed fraud. Other times you may not have enough labels to train a model, or you're not sure how often you need to retrain a model in production. All these topics are discussed in this book, in addition to many other topics, such as the data leakage problem, which is probably one of the worst things in machine learning and yet surprisingly common, and also how to engineer good features and how to monitor shifts in the data. The book gives a lot of good examples and draws a lot from the author's experience in industry. I think it has so far been the best on this topic, but some of the claims or arguments in the book could maybe be better articulated or backed up with references, just so that people who don't have a lot of experience or are not experts in the field can also follow the discussion. So I rate this book 4 stars for depth, 4.5 for readability, and 5 for practicality.

We can't talk about data science without talking about data visualization. The next two books will be about data viz. The first book I recommend for any data nerd is Storytelling with Data. Many of you asked me how to design dashboards and charts. I'd say read this book first and you'll have a pretty good idea of what's good to do and what's bad to do. There are so many basic principles covered in this book, and tons of tips and tricks to think like a designer, putting yourself in the shoes of the audience and catching their attention with your story. I also enjoy the 5 case studies at the end of the book that show you different ways to improve a visualization and tell
a better story. So if you ever do data visualization, I believe this is the first book you want to read. The only thing I'd love to see more of is the advanced or non-traditional types of visualization that could also be very powerful for storytelling. Also, all the examples in this book are static charts, so I feel like the interaction design discussion is a bit lacking in this book. So overall, I'd say this book is 4 stars for depth, 5 stars for readability, and 4 for practicality. Luke Barousse has a great video where he makes an in-depth review of this book, so be sure to check it out.

Something I'd say I do almost as a hobbyist is data visualization on the web. Power BI and Tableau are great for creating charts quickly, but for me it's still more fun to do web visualizations. They're more customizable and much more performant because they run on JavaScript, so everything runs very fast and smooth, and I can totally break free from the limitations of out-of-the-box software to create something a bit more pretty and unique. So if you want to go down this rabbit hole of creating browser-based visualizations, I'd recommend the book Interactive Data Visualization for the Web. It's actually all about D3.js. You can learn from scratch how to use this library with this book, though some basic HTML, CSS, and JavaScript would definitely help. That said, D3.js has a bit of a steep learning curve, so I'd say it's more suitable for hobbyists who love creating more complex and pretty graphs. However, we might all agree that beautiful and performant visualizations often catch people's attention and create the wow effect, and it definitely makes your storytelling more powerful. So overall, I think this book deserves 5 stars for depth for the topic, 4.5 for readability, and 4 for practicality.

It was a long video. Thanks for sticking around till now, and I hope you enjoyed this video. If you find value in this video, don't forget to smash the like button and share it with anyone who needs book
recommendations. For video course recommendations for beginners, please check out my other video over here. With that, I'll see you in the next video. Thank you for watching. Bye! [Music]
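
The phone-insurance story from the Naked Statistics segment is really an expected-value calculation, and it can be sketched in a few lines of Python. This is only a rough illustration, not something from the video or the book: the premium and phone price come from the story, while the theft probability of 1 in 1,000 per year is an assumed number (the video only says "one in thousands"), and the function name is made up for this sketch.

```python
# Rough expected-value check for the phone-insurance story.
# The loss probability is an assumption; the video only says "one in thousands".

def expected_net_cost(premium_per_month, months, item_value, loss_probability):
    """Return (total premium, expected payout, expected net cost of buying cover)."""
    total_premium = premium_per_month * months
    expected_payout = loss_probability * item_value
    return total_premium, expected_payout, total_premium - expected_payout


premium, payout, net = expected_net_cost(
    premium_per_month=7.0,   # from the story: seven dollars a month
    months=12,               # one-year plan
    item_value=500.0,        # price of the phone
    loss_probability=0.001,  # assumed: 1 in 1,000 chance of theft per year
)

print(f"Premium paid over the year: ${premium:.2f}")
print(f"Expected payout:            ${payout:.2f}")
print(f"Expected net cost:          ${net:.2f}")
```

Under these assumptions, the plan costs 84 dollars a year against an expected payout of about 50 cents. It would only break even if the yearly chance of losing the phone were around 17 percent (84 / 500). The sketch ignores breakdown cover and any deductible, so treat it as a back-of-the-envelope check only.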
Info
Channel: Thu Vu data analytics
Views: 338,136
Keywords: data analytics, data science, python, data, tableau, bi, programming, technology, coding, data visualization, python tutorial, data analyst, data scientist, data analysis, power bi, python data anlysis, data nerd, big data, learn to code, business intelligence, how to use r, r data analysis, vscode
Id: uFTd2b23GvI
Length: 16min 12sec (972 seconds)
Published: Sat Oct 08 2022