R vs Python | Which is Better for Data Analysis?

Captions
What's going on, everybody? Welcome back to another video. Today we're going to be comparing Python versus R, and we're going to see which one is better. [Music]

Now, before I start this presentation (yes, I made an entire presentation for this video), I have to address the elephant in the room. About a month ago I made a somewhat controversial post. I don't think it's controversial, but some people apparently did, and it's right here, hopefully on your screen at this time. All it says is "Python is better than R." That's my opinion, but it stirred up a lot of emotions for a lot of people. A lot of people messaged me and commented on it, and apparently a lot of people took offense to what I said and wanted me to explain why I felt this way. I did not respond, for the pure fact that it was more fun to watch them argue and complain than to answer their question. But I also knew I was going to be making this video anyway, so I figured they could watch it whenever I put it out, which is right now. So if you are only on LinkedIn, you've never watched my channel before, you're seeing this for the first time, and you remember when I posted that, I hope this will be clarifying for you. Without further ado, let's get into the presentation, and we will start from there.

All right, so some of the things we're going to be discussing today in our Python versus R presentation: descriptions, different libraries, the code syntax, pros and cons of both, and my final answer. I will say before we get into it that I'm not trying to go super in depth; I tried to make it as user-friendly as possible. If you guys really want a more in-depth presentation on just one of these, I can absolutely do that, and I plan on doing that at some point. But this is going to be kind of high level, and more about my thoughts and feelings regarding this, because it is a very emotional thing, I believe. Without further ado, let's get
into the description of both, again keeping it more high level, then getting to some specifics, and then my conclusion.

So let's look at the description of both Python and R, starting with R. R is a programming language developed for statistical analysis, and the people who mostly used it for a long, long time were statisticians. Just recently, within the past five to ten years, it has really been used for data science, data analysis, visualizations, and all of those things. It was developed in 1993, again, like I just said, primarily for statisticians, data miners, and analysts, and it's used by a ton of very large companies. Some of them are Uber, Facebook, and Google, but there are tons of companies, even small companies, that use R. So if your company does any type of statistics or statistical analysis, there's a good chance that your company has either used R in the past or is currently using R as a programming language.

Now on to Python. Python is a general-purpose programming language. It's used for almost anything you can imagine. It may not be the best thing for every single thing it can do, but it can do almost anything, so it's very general, very broad. It is quickly becoming the most popular programming language in the world, and it is used by companies like Google, Facebook, and Netflix. Now, if you noticed, in the companies that use Python and R, both Facebook and Google are on that list. That wasn't by accident. I did that on purpose, because I wanted to show that these large companies are going to use both programming languages for what they're good for, which obviously we will talk about later, but I wanted to put that there for, I guess, foreshadowing.

Now, before we look at libraries and packages, I just want to say that if I did not highlight your favorite library or package on here, I am sorry. There are so many, especially with R; there are just hundreds and thousands of different packages and libraries. I just can't possibly put them
all on here, so these are just a highlight of some of the more popular ones, the ones that I have personally used, and I hope that you are not offended by that.

Let's start with R. For data collection, you can use things like Rcrawler, readxl, readr, and RCurl. For data wrangling and exploration, there's dplyr, sqldf, data.table, readr, and tidyr. And for data visualization, there's ggplot2, ggvis, plotly, esquisse, and shiny.

Over to Python. For data collection, there's pandas, requests, and Beautiful Soup. For data wrangling and exploration, there's pandas, NumPy, and SciPy. And for data visualization, there's Matplotlib, seaborn, and plotly. Again, this is just a high-level overview of some of the packages in each of these programming languages. If you have never used R or Python, I think these packages are a really good place to start.

Now for the code and the syntax of both. I tried to stay neutral on this; I tried to just say what everyone else was saying, because I have my own very strong thoughts and opinions on this, but I wanted to stay somewhat unbiased, at least for this one. For R, it's easy-to-medium difficulty to pick up and start working with from scratch. If you've never picked up R, it can be kind of difficult to pick up. A little more advanced: it can be difficult to maintain your code, especially as you start to scale it, and that is a big problem that a lot of people have addressed or talked about with R. With Python, again, it's easy-to-medium difficulty to pick up and learn. I think it can be about the same difficulty as R, and that's what a lot of people said, so that's not just my opinion. But it's easier to write and maintain larger-scale code, so as you start building larger projects, join larger teams, or take on more data, it's just easier to scale up.

Now on to some syntax examples. I 100% cherry-picked these, but I do feel like they're pretty representative of what the code looks like as a whole, and
so a lot of people are probably going to get mad at me, saying, "No, R is much easier than this," and you may be right in some aspects, but for the most part I feel like this is fairly accurate. We're just reading in a CSV file and then trying to find the mean of a column or field, and that's about it. As you can tell, R is just a little more difficult, a little more complicated. Python is a little cleaner; it's a little easier to read and pick up, and that's something a lot of people say about Python: it's very easily readable.

Now let's look at some of the pros and cons of both, starting with R. Some of the pros are that it is open source; it is fantastic for statistical analysis; it has hundreds of packages and libraries purely for analytics, and that's what R is, it's purely for statistics and analyzing data; and lastly, it is easy to build visualizations with R. Now for the cons. It can't be embedded in web applications, and from what I've read that's purely for security reasons, so that is a big downside of using R. You need to know a large number of packages and libraries. You can't just know one or two. In Python, you can know pandas and do a lot of different things with it; R doesn't really have that. You have to know several things in order to get one task done. And lastly, R can run slow because of how it stores its data. So those are some of the pros and cons of R.

Now let's move on to Python. Some of the pros for Python: it's open source; it's easy to read and learn, especially if you're just picking it up for the first time; it can be embedded into web applications, which can be very important for a lot of people; and there's a growing number of libraries for data analysis. There is, of course, a growing number of libraries and packages for R as well, but those are much more well established, while Python's are still growing, coming out, and catching up to R fairly quickly. For the cons, the
processing speed can be slow, especially depending on what library or package you're using, but I think that's a con in both R and Python; on some level they're going to run slow. It uses a large amount of memory, which is kind of part of why it's running slow. It's simple to learn and simple to use, and sometimes that's actually an issue, because it's so simple that when you need to do really complicated things it can be kind of hard to do, whereas in R that's what it's built for. It's made for those complex calculations, and that's why those packages and libraries are built the way they are. And lastly, the libraries for all analytics needs are still being developed. So yes, it is a pro that those numbers are growing, but it's still a con that they're behind R. R has more being developed and more already developed in terms of its libraries and packages being built out; for Python, it is still growing.

Now on to my final answer: which is better, Python or R? It really depends. But going back to my LinkedIn post that we talked about at the very beginning, I will say that I still 100% believe that, because for my type of work, the stuff that I do, Python is 100 times better. It's 100 times more useful, so to me Python is better than R. But it really does depend on what you're using it for. If you're doing purely statistical work, R is going to be the better choice. If you're doing machine learning, Python is arguably much better. In my opinion, R is harder to learn but has more features, while Python is easier to learn but isn't as developed yet. So what I genuinely think you should do is try both. I think you really need to get some hands-on experience: take a course in both, see what you think, and determine for yourself which is better. I will go back to that LinkedIn post for a second. I believe that for me personally Python is just better. I can use it for so many things. It is, in my opinion, much better suited for
me and what I do for my job, so for me Python is way better. But for other positions and other people, R may be the programming language of choice, and I'm totally okay with that. There were a lot of people in the comments who were writing, "It just depends," and "Why do you think one is better than the other? Why can't it be both?" I really wanted to respond and be like, "I agree with you," but I didn't, because again, I thought it was more fun, and I knew I was making this video. So genuinely, from the bottom of my heart, to all those people: I agree with you. I want you to feel some vindication, some sense that you were right, and I hope this was a good outcome for what you were hoping for.

I have nothing against R. I have used it, and I've taken a few courses on it. I have not used that much R in my actual job, although the data scientists in my department use it quite a bit. I mostly stick with Python, and again, that's why I like it better. But I can honestly say that I've given both a fair chance, and I think you should do the same. I think you really should test out which one you personally think is better.

Thank you guys so much for watching. I really appreciate it. If you like this video, be sure to like and subscribe. I feel like it's worth subscribing to; I've got some pretty good videos, and I've got a lot of videos coming out soon. Thank you for joining me, and I will see you in the next video. [Music]
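
As a rough illustration of the syntax example described in the captions (reading in a CSV file and finding the mean of a column), here is a minimal Python sketch. The code shown on the slide is not part of the transcript, so this is not the presenter's code: the sample data, the column name `age`, and the helper `column_mean` are all hypothetical, and the sketch uses only the standard library rather than pandas.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data standing in for a CSV file on disk.
SAMPLE_CSV = """name,age
Ann,30
Bob,40
Cy,50
"""


def column_mean(csv_file, column):
    """Return the mean of a numeric column read from an open CSV file."""
    reader = csv.DictReader(csv_file)
    # Skip empty cells so missing values do not break the float conversion.
    values = [float(row[column]) for row in reader if row[column] != ""]
    return mean(values)


# io.StringIO lets the example run without a real file; with a file on disk,
# pass the result of open("some_file.csv", newline="") instead.
average_age = column_mean(io.StringIO(SAMPLE_CSV), "age")
print(average_age)
```

With pandas, which the video lists for data collection and wrangling, the same task is commonly written in one line as `pd.read_csv(path)[column].mean()`.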
Info
Channel: Alex The Analyst
Views: 54,678
Rating: 4.84271 out of 5
Keywords: Data Analyst, Data Analyst job, Data Analyst Career, Data Analytics, Alex The Analyst, r vs python, python vs r, r for data analysis, r for data science, python for data analysis, python for data science
Id: 1gdKC5O0Pwc
Length: 11min 51sec (711 seconds)
Published: Tue Feb 16 2021