ChatGPT + Jupyter Notebook = Mindblowing! 🤩

Video Statistics and Information

Captions
In today's video, I'm going to share with you how to use generative AI tools to speed up learning and doing data science projects. We are not talking about copy-pasting things from ChatGPT; we are talking about using those tools right inside your favorite IDE, such as VS Code and JupyterLab. I believe these tools can really double or triple your pace of learning and doing projects. Data cleaning, pre-processing, and exploration can take up to 80% of the time in a data science project, so these tools can save you a lot of time for the more interesting and more important analysis. But at the same time, these AI coding tools are not perfect, and there are several concerns, especially if you want to use them in your job, so we'll discuss these issues later in this video. Without further ado, let's get started.

The first tool I want to share with you is Jupyter AI. It's a Python library that provides a user-friendly way to explore generative AI models in notebooks. It's especially convenient for people who use JupyterLab, but if you use other IDEs, you can use it as well. Here's the documentation for this library.

Let me quickly show you how to install it on macOS. I'm using an M1 MacBook; if you're using a different machine, these steps can be slightly different. Feel free to skip this part and jump right to the demo part, which is more interesting to see.

It's recommended that we use conda to install this library, so if you don't have conda, you can go ahead and download Miniconda to your computer and install it. After you finish installing, you should restart your terminal window. Now we use conda to install some prerequisites. Firstly, we'll install Python 3.10 and JupyterLab. We use Python 3.10 because this is the latest Python version that is supported by this library. Additionally, we need to check that we have Jupyter Server 2.x and not 1.x. We can check the current server version by running jupyter --version and checking for the line beginning with jupyter_server. Here we do have Jupyter Server version 2, so it's all good.

Now we go ahead and create a virtual environment, just to avoid any conflict with any of your earlier package versions. So I'll run conda create, with "test Jupyter AI" as the name of my virtual environment, and we are using Python 3.10 for this virtual environment. Now we can activate this environment and install Jupyter AI. For some reason, for the MacBook M1 specifically, we also need to uninstall the grpcio package from pip and reinstall it with conda. Don't ask me why; I just followed the documentation.

After doing all these steps, we can check that the Jupyter AI server extension is enabled by running jupyter server extension list. If everything is enabled, that's good. To verify that the front-end extension is also installed, we can run jupyter labextension list. As I want to use OpenAI models, I will also pip install openai, and then we can start JupyterLab.

Now let me create a new notebook and load the Jupyter AI extension. To use OpenAI models, we also need to set our OpenAI API key as an environment variable, so let me quickly do that. If you want to share this notebook with other people, please make sure to hide your API key somewhere. Everything is now set up.

To ask questions, you can use the %%ai magic to write your prompt, just like you do with ChatGPT. The cool thing is that you can specify the language model that you want to use. From this list, we can see which generative models are supported by this extension and what environment variables we need to set in order to use them. Let's say that I want to use the ChatGPT model to write a Python function that replaces all the missing values with zero in the numeric columns of a data frame. After a few seconds, there you have it. It's good to note that this magic command also works anywhere the IPython kernel runs, so JupyterLab, Jupyter Notebook, VS Code, or Google Colab.

Now, if you don't want the text explanation in the response and only want
the code, you can specify the format of the output by adding the format argument. In this case, we can choose the format to be code. The code output will then be automatically generated in a separate cell. That's pretty cool, right? We can also play around with other formats, such as Markdown, math, HTML, JSON, or text.

What I like about this Jupyter AI library is that you can actually use different language models. You can specify the language model here, potentially get different outputs from different models, and compare, contrast, and evaluate the models together. One idea for testing them is to use a coding example from a reliable source, like a book, and then evaluate the outputs of the different models against it.

In addition to using magics, you can also use the chat interface to interact with the language models. You might notice that we have a chat icon here in the left side panel of JupyterLab. This chat UI is part of the Jupyter AI extension. Here you can specify what you want to use as a language model (in this case, I use ChatGPT) and what you want to use as an embedding model, and you can enter your API key here, which is the same as what I had before.

There are many more things you can do within this chat interface. For example, you can ask about something in your notebook by highlighting what you want explained. So let me highlight this chunk of code, ask "What does this code do?" in the chat panel, and check the "include selection" option here to include the selected code in my prompt. Then you'll get a response explaining what this chunk of code does.

Another thing that I always forget to do is to write documentation for my functions. So let's say I have this function: I'll ask the AI to create a docstring for this function, describing the parameters and what the function does. You may want to check and adjust some details, but I think this is very useful. And another way you can use the tool is
to ask it to optimize a function that you just wrote. For example, if you know that you just wrote a very inefficient for loop, like the one that I have here, you can select it and ask the AI to optimize it for you. Then make sure to check the code and see if it's working properly.

Another cool thing I want to show you is generating notebooks. After all, a blank page can be scary, and I hate starting things from scratch. You can ask Jupyternaut to generate notebooks by starting your message with the /generate command. For example, I want to create a tutorial notebook on how to use the seaborn library. Generating notebooks can take a few minutes, so while you're waiting, you can continue to ask other questions. Once the notebook is ready, Jupyter AI will send you another message with the name of the file that it generated. So let me open it. It's quite basic, and some things might not be working completely, but you can see that this could be a very good starting point.

Another useful thing you can do is teach Jupyter AI about local data so that it can include it when answering your questions. This local data is embedded using the embedding model that you selected in the settings panel. But be careful: if your data is confidential or sensitive, make sure that you review the data policies of the model provider before doing this. If I have a document like this that I want to teach to the AI, I can use the /learn command together with the path of the document. You will then receive a response when Jupyter AI has indexed this documentation into a local vector database. After that, you can use the /ask command to ask questions specifically about the data that you just taught the AI. This can be useful if you don't want to find information in the document yourself. To have the AI unlearn this document, you can run this command, and Jupyter AI will delete the local vector database and forget all the information that you have taught it.

It's important to note that when you
embed a big document or include a large chunk of code in your prompt, this could cost you more money, so do watch your usage if this is the case.

The second tool, which you might have heard of, is GitHub Copilot. This is an AI pair programmer that is trained on publicly available code on GitHub. GitHub Copilot is available as an extension for VS Code or PyCharm, which are popular IDEs for data nerds like us. It's free to use for students and teachers, and since I'm technically still a student, although I do have a day job, I can install it and use it for free. Otherwise, you can pay about 10 dollars per month, which in my opinion is totally worth it if it saves you even a few hours of your time.

As I already have this extension installed in my VS Code, I'll just show you how I would use it. You can use your comments as the prompts. So if I say "import pandas, NumPy, and seaborn", the code to import those libraries will be suggested for me. If you want to approve the code, you can just hit the Tab key. Then I want to read the data set, which is what I have here, and then I want to display this data frame, show some descriptive statistics, and maybe check for missing values as well. Now I also want to explore the data a bit more visually. For example, I want to create a Pearson correlation matrix of the numeric features and display the heat map. So it's all super, super quick. I can also ask Copilot to create a plot for me to further explore the relationship between two variables.

Honestly, I don't remember by heart the code for all these steps, so if I had to do it manually by myself, it would take me maybe half an hour to an hour of Googling and searching Stack Overflow or documentation. So this tool can really help you speed up and get some initial results very quickly. From there, you're going to get more ideas for further analysis and decide what to focus on.

A free alternative to GitHub Copilot is Amazon CodeWhisperer, which is developed by Amazon. Apparently,
you can also use this tool as an extension in VS Code, PyCharm, or JupyterLab, but I doubt it can be better than GitHub Copilot. Also, Hugging Face recently developed a new language model for coding assistance called StarCoder. It's a new kid on the block and is completely open source, which means it's completely free to use, and you can also fine-tune it to create your own coding assistant. You can play with it in the playground for this model here. I think it works pretty decently for Python. However, I haven't tried and tested StarCoder and Amazon CodeWhisperer in detail, so if you give them a try, let me know what you think.

As we've seen earlier, these generative AI tools can be very powerful and useful, but you can't overly rely on them due to many limitations.

Firstly, privacy is a huge concern. This often makes companies reluctant to let their employees use these tools for work purposes. When using these tools, you may be sharing your code or potentially sensitive information with a third party. Your conversations may be saved and used for further training of the model. So we should definitely check the privacy policies of these tools, and I think it's best to only use them for non-sensitive information. Using APIs on a private server may help to avoid this issue, but I'm not an expert on data privacy and security, so do consult some experts if you have doubts.

Another concern is that a lot of the time it's not clear which data the language models were trained on, and whether the code is open source or permissively licensed. This is another reason why companies are not yet willing to use these tools: they don't want to get into trouble regarding intellectual property.

The third limitation is quite well known: the suggested code can produce errors or incorrect results, and, perhaps even worse, you might get code that is vulnerable to security exploits. So instead of cleaning your own mess, you might now be cleaning the AI's mess.

If you're interested in learning more about
language model safety, check out this earlier video over here. I hope this video is helpful. Thank you for watching, and see you in the next video. Bye!
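
As an illustration of the prompt used in the Jupyter AI demo (a Python function that replaces all missing values with zero in the numeric columns of a data frame), here is a minimal sketch of what such a function might look like. It is not the output shown in the video: the function name fill_numeric_missing_with_zero and the example data are hypothetical, and pandas is assumed to be installed.

```python
import pandas as pd


def fill_numeric_missing_with_zero(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with missing values in numeric columns set to 0.

    Non-numeric columns (for example, text columns) are left unchanged.
    """
    result = df.copy()
    # Select only the columns with a numeric dtype (ints, floats, ...).
    numeric_cols = result.select_dtypes(include="number").columns
    # Fill missing values in those columns only.
    result[numeric_cols] = result[numeric_cols].fillna(0)
    return result


# Hypothetical example data, not taken from the video.
example = pd.DataFrame(
    {
        "age": [25.0, None, 40.0],
        "score": [1.5, 2.5, None],
        "name": ["Ann", None, "Cy"],
    }
)

cleaned = fill_numeric_missing_with_zero(example)
print(cleaned)
```

Working on a copy keeps the original data frame intact, which makes it easier to compare the result with the input when checking AI-generated code, as the video recommends.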
Info
Channel: Thu Vu data analytics
Views: 34,363
Keywords: data analytics, data science, python, data, tableau, bi, programming, technology, coding, data visualization, python tutorial, data analyst, data scientist, data analysis, power bi, python data anlysis, data nerd, big data, learn to code, business intelligence, how to use r, r data analysis, vscode
Id: l2YU8QuXiTM
Length: 12min 40sec (760 seconds)
Published: Sun May 28 2023