R vs Python

Video Statistics and Information

Captions
Python is an open source programming language commonly used in data science, as is R. Which one should you be using? At this point you might be expecting a fence-sitting, "well, it depends" kind of answer, but no, I'm going to tell you exactly which one to pick right now. So here goes: I ask you a question, and based on your answer you'll know which language to go for. Ready?

Okay. So, do you have much in the way of programming experience? None? Use R. Some? Go for Python. Lots? R again. I'll explain. Okay, question two: do you care about awesome-looking visualizations and graphics? If yes, go with R. What about the problem you're trying to solve? Machine learning stuff? Go with Python. Statistical learning? R is your best bet. And finally, what do most of your colleagues use? Use that.

Glad to get that off of my chest. Now, we could all just finish here and go about our day, but I'd like to explain a little bit more about what these two languages are and how they're best put to use, because increasingly the question isn't which to choose but how to make the best use of both programming languages for your specific use cases.

So let's start with the slightly older of the two, which is Python. Now, Python was released in 1989, and it's a general-purpose, object-oriented programming language that emphasizes code readability through its oh-so-generous use of white space. And it's super popular, just behind Java and C in popularity. In fact, there are some awesome libraries that support data science tasks. So, for example, we have "numpty". It's actually NumPy; "numpty", that's British slang for an idiot. And NumPy is used for large dimensional arrays, and then for data manipulation we have pandas. There are also specialized tools for deep learning, so you can use things like TensorFlow. And you'll often find yourself working with Python in Jupyter notebooks as your IDE.

Now let's compare that to R, which is optimized for statistical analysis and data visualization. It was developed just a little later, in 1992, and it has a rich ecosystem with complex data models and elegant tools for data reporting. There are thousands of packages available via the Comprehensive R Archive Network, otherwise known as CRAN, and these things are for deep analytical tasks. Now, R provides a broad variety of libraries and tools for things like cleansing data, creating visualizations, and training deep learning algorithms. And R is commonly used with RStudio, which is an integrated development environment for simplified statistical analysis, visualization, and reporting.

So both R and Python are open source and are supported by large communities continuously extending their libraries and tools. Really, the biggest differentiator is how they are used. R, as I've mentioned, is mainly used for statistical analysis, while Python provides a more general approach to data wrangling. You might use R for customer behavior analysis, and then you might use Python to build a facial recognition application.

Now, right up front I said that if you have no programming experience, or quite a lot of programming experience, R was the better bet, and if you fall somewhere in between, then Python is easier to pick up. But how can that be? Well, Python is considered a multi-purpose language, much like C++ and Java are, and it has a readable syntax that's easy to learn. It's considered a good language for beginner programmers or those with experience in similar languages. Now R, on the other hand, is built by statisticians and leans heavily into statistical models and specialized analytics. Novices can be running data analysis tasks within minutes with just a few lines of code using R, but the complexity of advanced functionality in R makes it more difficult to develop expertise.

Now, a few other considerations to keep in mind, and they all relate specifically to data. When it comes to data collection, so actually gathering the data in the first place, Python supports all kinds of data formats, from comma-separated value files, or CSV files, to JSON sourced from the web. In contrast, R is designed for data analysts to import data from things like Excel and text files. Now, for data exploration, if you use Python you can use the pandas library to filter, sort, and display data in a matter of seconds. R, on the other hand, is optimized for statistical analysis, so you can build probability distributions or apply different statistical models. And then finally, data modeling has some differences too. Python has libraries for data modeling, like NumPy. In R, you'll sometimes have to rely on packages outside of R's core functionality.

Did I say finally? There's one more, and that's visualization. With visualization, R has the clear edge, with a base graphics module allowing you to easily create basic charts and plots, and you can use ggplot2 for more advanced plots, such as complex scatter plots with regression lines.

R and Python have their strengths, but in truth most organizations use a combination of both languages. You might conduct early-stage data analysis and exploration in R and then switch to Python when it's time to ship some data products. So which should you use? Both. You're probably going to use a bit of both. And if you want to see more videos like this in the future, please like and subscribe. Thanks for watching.
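
As a rough illustration of the Python data collection and exploration steps mentioned in the captions (reading CSV and JSON, then filtering and sorting with pandas, with NumPy holding the values as an array), here is a minimal sketch. The sample records, the column names, and the "more than 5 purchases" threshold are all made up for illustration and do not come from the video; the in-memory strings stand in for a real file or web response.

```python
import io

import pandas as pd

# Hypothetical sample data, invented purely for this illustration.
csv_text = """customer,region,purchases
alice,north,12
bob,south,3
carol,north,7
dave,east,9
"""
json_text = '[{"customer": "erin", "region": "south", "purchases": 15}]'

# Data collection: load each format into a DataFrame.
# io.StringIO stands in for a file on disk or a response from the web.
csv_df = pd.read_csv(io.StringIO(csv_text))
json_df = pd.read_json(io.StringIO(json_text))

# Combine both sources into a single table.
customers = pd.concat([csv_df, json_df], ignore_index=True)

# Data exploration: keep customers with more than 5 purchases,
# then sort by purchases, highest first.
frequent = customers[customers["purchases"] > 5].sort_values(
    "purchases", ascending=False
)

# The same column as a NumPy array, e.g. for a quick summary statistic.
purchases = customers["purchases"].to_numpy()
mean_purchases = purchases.mean()

print(frequent)
print(mean_purchases)
```

Swapping the in-memory strings for a file path or URL leaves the rest of the sketch unchanged, since `pd.read_csv` and `pd.read_json` accept those as well.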
Info
Channel: IBM Technology
Views: 274,418
Keywords: IBM, IBM Cloud
Id: 4lcwTGA7MZw
Length: 7min 6sec (426 seconds)
Published: Tue Aug 30 2022