Tutorial: Getting started in Python

Captions
Well, here we go. It's time to start this introduction, or Getting Started in Python, in Transform 2022. So welcome to everyone, wherever you are. This event, Transform 2022, has already started over the weekend, but today I'm going to be kicking off the tutorials. I'd invite you all to visit transform.softwareground.org. You'll find all the information about the various tutorials available to you this week, and everything is also in the schedule, which you can reach by going to the short URL sw dot ng forward slash t22 schedule. That again, if I open this, will take you to the schedule here, and you'll have links directly to the YouTube channels there. These obviously will be available after the event as well.

So without further ado, I just want to get us started. My name is Rob and I'm going to be taking you through this Getting Started with Python tutorial. Just to set the scene, this is really aimed at people with no Python experience whatsoever, and I'm not teaching you Python here. What I'm aiming to do is to give you some tools and some exposure to what you can do with Python and how it works, so that later in the week, or maybe in a month or two months when you look back at some of these tutorials, you'll be able to follow along, and then in time you'll write your own code.

Now, if you're watching live and you want to follow along, you'll find links to this Git repository. I'm not going to explain Git in detail here. Suffice to say that you can find all the material that I'm going to be sharing with you today on this web page, and if you want to download it, navigate to this page, t22 getting started, and go to the Code button, this big green button here. You've got access to a zip file that you can download, and that will include all the data, all the files, all the images that I'm going to use here today. Now, in this tutorial I will be running through this Intro to Python live notebook, and I'll illustrate what a notebook
is shortly. You're welcome to code along in there if you like. If you're coming to this later and you want to have the answers, as it were, you can have a look in the main notebook, which I won't be showing live but I'll be referring to during the introduction. Now, I may depart slightly from what you see in main because I'm going to be live coding in this notebook, so there'll no doubt be some differences.

A couple of files I want to point you to. First of all is this installation and setup file. This is a file that's written in what's called Markdown, and I'll talk about that briefly. I'm going to go through this, but on a separate tab, using something called reveal-md, just to look at the slides. If you want to look at the raw text, you're welcome to look at it here. What else? Yeah, that's about it. So with that I will get started with the installation and setup and just try to get us all on the same page. As I say, you can follow this on the GitHub Markdown file if you prefer, but otherwise just follow along here. What I'm going to do is kill my webcam so that hopefully this works smoother.

So we'd like to get Python installed on your computer, and let me say straight away, it doesn't matter whether you're on Windows, Mac or Linux, Python will work on all of them. The first thing I would suggest is navigate to python.org. Now, you may not want to install from here. I'm going to show you two different ways of installing, but the easiest way, I'd say, to install Python is simply to go here, choose your operating system, Windows, Mac or whatever, and then download Python and install it just like any other program on your operating system.

Now, in this particular tutorial I'm not going to be following this. I'm using something called Miniconda. What is Miniconda? It's a minimal installer for conda. I realize that's yet more jargon. Conda is a program that allows you to install packages that Python needs, and Miniconda will include that program, so
that means you've got conda, you've got Python, and all the packages that they depend on. Packages is just a word for code, so other code that other people have written that these different languages and programs depend on. If you want to use Miniconda, I've included links to the installers over here, so you can come to this page, find your platform, Windows, Mac or Linux, select the correct installer, most likely 64-bit, and then just go ahead and install that. So run the installer, accept the license, accept the defaults, and at that point you've got Miniconda installed. Now, at this point you won't yet have a full installation that allows you to follow the tutorials this week, but we're going to get to that.

Another way you could approach this is to use Anaconda. Anaconda includes Python and it includes conda, just like Miniconda, but it also includes hundreds if not thousands of other programs or packages that you might want to use for data science, for machine learning, for visualization and so on. You might choose Anaconda if you're completely new to conda or Python and you like the convenience of having Python and over 1500 packages automatically installed at once, which means you don't have to come back and install them again later. You have to have some time and some disk space, because it'll take a few minutes and I think it's about three gigs. On most modern machines that won't be an issue, but perhaps it is for you.

Anaconda is also good if you don't want to individually install each of the packages you want to use, and you basically have access to a set of packages that are curated and vetted for interoperability and usability, so you know that everything is going to work. However, do be careful if you're working for a corporation, because there are licenses for Anaconda. If you're a student, or if you're doing this as a personal project, that's totally fine, but if you're using Anaconda inside a corporate environment you'll need to get
a license. Now, you might choose Miniconda, in contrast, if you don't mind installing each of the packages you want to use (and I'm going to show you a way to do that shortly), if you don't have the time or the disk space to have everything in Anaconda, if you just don't want to have everything, and if you just want fast access to Python. Just a brief note on different operating systems: Python works on Windows, Mac and Linux, and actually other platforms as well. There are links to that right here if you're interested, but I'm not going to dwell on that.

OK, so why would we need a coding environment? I've talked about packages. I've said that we get Python, but then we have these other things we need to install. Why can't we just do everything with Python on its own? Well, if we look at this plot as an example, to create this we've needed to load different types of data: logs (LAS files, for the geoscientists amongst you), some maps, some seismic. Then we're actually plotting them, and in order to do this we're using some Python code, but that's been written by different people. We're using libraries called NumPy, matplotlib and welly, and each one of these is a collection of Python code that allows you to solve a given problem, which means you don't need to reinvent the wheel. NumPy will allow you to solve numeric Python problems, so basically anything to do with mathematics and science. Matplotlib will allow you to make graphs and plots. And here I mentioned welly: welly will allow you to load and manipulate log files, LAS files.

Now, what we actually want is to delegate this dependency management to a program, what's called an environment manager, and I've mentioned one of these already, and that is conda. What you do not want to do is to install, for example, NumPy and then go and look up yourself what other packages NumPy depends on, and so there are tools that do that for you. Those tools include, amongst others, conda, which I've
talked about and which we will use in here, pyenv, virtualenv and many others. In this particular tutorial we'll be using conda, and if you go to the documentation there, you see everything you need to manage what's called an environment. An environment is essentially a collection of software at a given date. If I was to create an environment today in order to make the plot that I just showed you, all the software that I installed would be dated 25th of April 2022. If you come to this video in six months' time, some of that software will most likely have been updated, and so if you create an environment, some of the code might be slightly different. That's why sometimes you want to manage these environments, and in particular use an environment file that can tell you not only which packages you want but also which versions you want. That's not always something you want to do, but it's a possibility.

I've also linked to Stack Overflow. If you're new to programming this will be a new website to you, but I'm pretty sure you'll end up on it at some point. It's essentially a question and answer forum, and it's one where there's a really good discussion about what the differences are between conda, pyenv and virtualenv.

So in conda, how do we actually set up an environment, as we might want to do for this week's tutorials? Well, the first thing that we're going to do is download an environment.yaml file. I'm going to click this link just to show you what it is. If you're on Windows, just leave it in your Downloads folder and then the following command will just work. If you're on Mac or a Linux environment, just download it wherever you want and then, in your terminal, navigate to that folder before running the command that I'll tell you shortly. But before I do that, I want to show you what we're actually downloading. This environment.yaml file is just a text file, nothing else, and you can edit it. It does have to
follow some indentation, so you can't just free-write in Word or whatever; you have to write in something that respects the white space. It has several things in it. We'll revisit this later, but at this point I'll just say it's got a name, and this is going to be the name of the environment. The channels are going to be where on Anaconda you're grabbing the data from: the programs, the Python code. And then it's going to list all your dependencies, and we'll come back to these a little later in this short installation presentation to talk about what we have. At this point I just want to say that we're specifying a name, a source and some code that we're downloading.

Now, once we have that, again, if you're in Windows you can run this command all on one line. You'll have to open what's called an Anaconda Prompt. On Windows, if you've installed Miniconda or Anaconda, you will have a new program, which you can access from your Windows search key, called Anaconda Prompt. So open that prompt and then in there type, on one line, conda env create -f and then all of this. If you're following this from the GitHub Markdown file, you'll be able to just copy and paste this command.

If you're on a Mac or Linux environment you won't have an Anaconda Prompt. However, if you've installed Miniconda or Anaconda and you followed the default installation, you'll be able to run these conda commands directly from your terminal or your terminal emulator. If you're on Mac or Linux, as I mentioned earlier, just save this environment.yaml file wherever you want, perhaps in your Downloads folder, navigate to that Downloads folder, and then you simply run conda env create. You don't even need to have -f and all the rest of it, because that will be the default.

So at this point we've created an environment, and as you may have seen up at the top here, it's one of two slides. Our environment is created. What that means is that we've downloaded the packages, but the
next thing we need to do is actually make it available to the code or the software that we're going to use later as our coding environment. To do that, the first thing we need to run is conda activate and then the name of the environment. Remember, the name was found in the environment.yaml file, and all of these commands we're typing in the Anaconda Prompt or in your terminal. The next one, again all on one line, is going to be python -m ipykernel install --user --name followed by the name of the environment, again that name that you find in the environment.yaml file. What we're doing here is using a package in Python called ipykernel; inside that package we're using a function, install, and we're passing arguments to it so that we make this environment available to the Jupyter Notebook. I realize I've not mentioned Jupyter Notebooks yet. We'll just have to hold that thought for now. We have to run these commands so that we've got access to the environment later on.

At this point we would have just created an environment solely with the things listed inside of this environment.yaml file, so only with these things, which again I'll explain a little later. But you might want to add things to it. If you want to add to an existing environment, you need to add a package. To do that, you first need to activate the environment once again, to make sure you're installing only inside of that environment, and then you have two choices. If the package you want is available from Anaconda, you run conda install followed by the package name. If, however, the package is not available there but is available only on the Python Package Index, which I can show you, you use pip. The Python Package Index is basically a place where there are, as you see, just under 400,000 projects. These are Python programs that you can install, and you install these programs specifically with pip. So either you use conda to grab something from Anaconda, or you use pip to grab something from the
Python Package Index, and again, that's pip install followed by the package name. So at this point we've basically got our environment and we're essentially ready to start coding.

To start coding following this particular approach, which as I'll explain shortly is not the only one, what we would do is run jupyter notebook. You would do that from a location on your computer where you have data. For example, if you were following along with this tutorial, you would have downloaded all these data from the zip file that I mentioned initially. You would have downloaded this zip file and extracted it to some location on your computer, and then in the Anaconda Prompt, or in the terminal on Mac and Linux, you would navigate to that location and run jupyter notebook. When you do that, you'll see in your terminal or your Anaconda Prompt that you start what's called a server. I don't know if this will be visible on YouTube or if it's too small, but basically in your Anaconda Prompt or terminal you start a server, and then your computer will open your default browser with the Jupyter Notebook. We're going to look at that in just a second. Well, actually, let me show you what that would be. If you run that, the Jupyter Notebook will open, and if you've opened it from the location where you've downloaded all the data, you'll see something like this.

Now, before we look at this we need to do a couple more things. First, let's have a look at this environment.yaml file to see what we actually installed. Let me go back and open this environment.yaml file, and now let's have a look at what we've got down here. We've installed Python, of course. We've installed pip, because that won't be there by default and we'll need this to install other packages. Then I already mentioned NumPy and matplotlib, so these are for numeric Python and plotting. Jupyter, this is our chosen coding environment, and we're going to talk about that next. ipykernel, we've
just used it, and this is something we need to make this environment available to Jupyter. Then pandas and openpyxl are things that you might need if you're dealing with 2D data sets, spreadsheet-like data sets, with openpyxl to deal with Excel files in particular. I added requests there; that's simply to make queries on web pages. And then from pip we're getting packages that are more geoscience focused: bruges for geophysical tools, welly and wellpathpy to deal with well logs and deviations, segyio and segysak to deal with seismic data, and gio to deal with file loading of common geoscience and subsurface formats. So this is a bare-bones environment, and if you use this environment in one of the tutorials later this week, you will no doubt need to install additional packages, which the various presenters will either share with you in another environment file or simply tell you how to install with conda or pip.

So now we've seen that, let's choose a text editor. If you want to write code, well, you need to write it somewhere. What I'm going to be using is the Jupyter Notebook. You've seen this image before, and the Jupyter Notebook essentially allows us to combine formatted text, code, code outputs and images: everything you need to prototype some code. Now, there are some pitfalls to using the Jupyter Notebook, especially for beginners. We'll touch upon them perhaps, but it's not something I think should stop you from using it.

If you want, you can use a more traditional text editor. There are plenty out there. I'm just listing a few here, and there are links if you want to visit them. It's really up to you what you choose; as long as it works for you, it's the correct editor. If you want to be more old school, again there are plenty of other editors out there, all totally valid, and again it really doesn't matter which one you choose as long as it
works for you. So at this point we've created environments and we've chosen a text editor (I've picked the Jupyter Notebook), and now we want to run Python. So how can we do that? Well, we can run Python in multiple ways. We can execute Python code in an interactive session. What does that even mean? It means that inside your terminal or inside your Anaconda Prompt you could just type python, and it would launch that program, and then you can write your code. That's fine for quick testing, for checking features in Python, but obviously that's not where you're going to write scripts that you want to reuse. Scripts are just Python files containing Python code, so you can run a Python script directly from the operating system, and we've actually done that already when we ran python -m ipykernel etc. There we were running some Python code using Python. And finally, you can use something like the Jupyter Notebook, a Binder instance, Google Colab, or any of the others, and again it really doesn't matter which you choose or use; it's whatever works for you.

So with that we're ready to go into the actual notebook. Let me close a few things here. I need to close things on another screen as well. There we go. And now I'm going to go into my Jupyter notebooks. Remember, we've started this, and as I mentioned, in this live YouTube presentation I'll be coding in this Intro to Python T22 live notebook. That's the one I've got open right here.

First let me give you a quick tour of the Jupyter Notebook. As you see, it opens in a web browser. Whatever your favorite browser is, it doesn't matter. I would say that some things, and in fact some things I'm going to show in this demo, will not work in Internet Explorer, but if you're on Edge, Safari, Chrome or any Chrome-based browser you're basically good to go. Firefox as well. In the Jupyter Notebook there are a few things that are good to check when you launch it. First of all, up here on the right, you see the name of our
environment, and so straight away I'm going to show you that if I go to Kernel and Change kernel, down here I see t22 setup. That is the kernel, or the code collection, the environment, that we're going to be using in this demo, and I would not have had this available to me here had I not run that command, python -m ipykernel install etc. So that's why we needed that.

Then we have the standard things: File, with open, rename, copy, etc. Edit, and we'll talk about that because I need to explain what cells are. Things about viewing and inserting. The kernel, which I've touched on: basically the kernel is the Python program running in the background and executing your code. Sometimes that program might freeze or break, and so sometimes you'll need to restart it, but in general they're pretty stable.

Then, if I click over here, you see that this first cell gets highlighted in blue, and as long as I'm in blue, if I use my up and down arrows I can navigate up and down in the notebook. If I hit Enter, the cell outline turns to green, and now if I use the up and down arrows I'm inside that cell; I'm editing that cell. If I want to go back to blue I can hit Escape, and then I can navigate. But you notice that the cell is no longer displaying as it was, so to fix that, I hit Shift+Enter, and that executes the cell. So you can switch between editing and navigating with Enter and Escape, and once you're happy with the cell you can execute it with Shift+Enter.

There are other things you can do as well, but I don't want to just say everything at once. There are plenty of shortcuts and I'll perhaps use a few in here. If you want access to the shortcuts, you can go to Help, Keyboard Shortcuts, and that will list all the built-in shortcuts. You can actually edit them to write your own if you prefer. I'll be using things like A to insert a cell above, B to insert a cell below, D D to delete a cell, things like that. I'll try to tell you what I'm doing every time
I'm doing it, but in case I don't, it's all there.

Now, we can have formatted text, and if I double click I can show you. First of all, here we've got a title with a single hashtag. Here we've got a subtitle with three hashtags. We've got a list with these ticks, and here we've got hyperlinks. You can pass a hyperlink with square brackets and the name you want to have, and then parentheses and the actual target. If I do this, we'll see that we've got links to the original Markdown page, where John Gruber explains what he wrote. Here you've got links to getting started in Markdown and the cheat sheet as well, and I invite you to go and visit those links if you're new to Markdown.

Very quickly, if I double click the cell I can go into edit mode as well. So there's an h4 heading. Here we've got unordered lists, so basically just a bunch of dashes, and we can obviously indent them to have sub-items. An ordered list will be numbers. And then we can have all the usual things: italic, bold, underline. I've put underline here; this is not strictly Markdown, but it works in the notebook. You can have quoted text. Let me show you what that is: quoted text will appear like this. And you can have tables. Again, don't memorize stuff you can look up quickly. Just go and grab it from somewhere else and modify it to match your needs. So that's formatted text.

Now, we can have mathematical equations as well. You can write LaTeX, and if you're not familiar with LaTeX, I've given you a link to it there. Here we've got an inline equation and here we've got a centered equation. For the inline equation you simply write your LaTeX code between dollar signs, and if you want to center it you have two dollar signs at the start and the end. For the actual syntax to write these equations, I'd invite you to visit the LaTeX documentation.

We can also have code, so code inline like this, or multi-line code, and the way we do this is we use backticks for inline
code. So backticks, well, I don't know what keyboard you're on, but they're not quotes or double quotes; they're a special character. And if you have three backticks and you give the name python, you can actually format Python code, so write Python code and it will be rendered with the correct syntax highlighting in the Jupyter Notebook.

Next, what we can do is actually have code output. Here I've got some Python code, perfectly valid, and if I run this you see that I get an In [1], so this cell has been executed. The code has been executed, and the output of this function, print, simply gives back to me whatever was inside my quotes here, so hello transform2022. So here we go: we've got formatted text, we've got inline code, we've got formatted code, and now we've got code output. And finally we can have images as well. Here I've got a link to an image which is stored inside of this same folder, in the images folder, and if I execute this cell you see that image is rendered in my notebook. So hopefully you see that with a Jupyter Notebook you can actually document your work really well, and I think at least several of the tutors this week will use Jupyter Notebook or perhaps JupyterLab, so you'll most likely see this kind of thing used this week.

Now, with that, I want to move into Python and actually show you some Python code. We'll start with things that you most likely already know, even if you've not seen them in Python, and here I'm using keyboard shortcuts A and B to add cells below. What we'll start with is simple addition. Nothing strange going on here. I'm putting spaces around the operator; that's Python style. It will work without the spaces, but you really ought to use them. So all the usual things that you'll expect: plus, minus, multiplication, that works. Every time, I'm executing the cell with Shift+Enter. Let's see division, so 11 divided by two. Here we see something slightly different. The operator is not strange or anything, but now we see that we're
getting back a floating point answer, which we expect from 11 divided by 2. In Python you will get back the most expected result, so 11 divided by 2 won't give you 5 remainder 1, it will give you 5.5. Similarly, if I do multiplication, 45 times 1.4, then if I use a float, a number with a decimal value, I will get back a float as well, even if that is something dot zero. I'm not going to get back the integer 18, I'll get back the float 18.0.

Now, if I come back to this example here, 11 divided by 2, sometimes you may want to have the long division or the integer division, and that's available to you in Python if you pass two forward slashes, so that will return five. And then, if you want the remainder of this division, you need to use what's called the modulo operator. The modulo operator is the percent symbol, and that will return to you the remainder of the long division. So addition, subtraction, multiplication, division, long division and the modulo operator. I'd wager that, apart from the long division and modulo, all of that's pretty familiar.

The next thing I want to show you is exponentiation, and in Python we do exponentiation with two stars. Generally you won't see spaces around them, but again it'll work with or without. So there you go, exponentiation. Now, there are more, but again this is really just a quick intro. I don't want to dwell on all the intricacies that you might see.

I do want to mention logical operators, so again I'm going to use B and A to add a few cells here. Logical operators are things we've seen before, so greater than, for example: 23 is greater than 4, and so that returns to me what's called a Boolean. In Python, Booleans are True and False, with a capital T and capital F. Likewise, if I say, I don't know, 543 is smaller than a thousand, or smaller than 10, that's going to be False, right? 543 is greater than 10.
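The arithmetic operators described above can be collected in one short sketch. The variable names are illustrative and are not taken from the tutorial notebook; the values 11 and 2 are the ones used in the walkthrough, while the other numbers are chosen here only as examples.

```python
# Arithmetic operators from the walkthrough (variable names are illustrative).
true_division = 11 / 2     # "/" always returns a float
floor_division = 11 // 2   # "//" is integer (floor) division
remainder = 11 % 2         # "%" (modulo) is the remainder of that division
power = 2 ** 3             # "**" is exponentiation

# Mixing an int with a float gives a float, even when the result is whole.
mixed = 4 * 2.5

print(true_division, floor_division, remainder, power, mixed)
# prints: 5.5 5 1 8 10.0
```

Note that the last value prints as 10.0 rather than 10, which is the int-versus-float distinction mentioned above.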
So no surprises there, hopefully. Now, you also have greater than or equal, so if I do, I don't know, 200 is greater or equal to 199, that's True, and likewise I can do 200 is smaller or equal to 199 and that will be False. So greater than, smaller than, greater or equal, smaller or equal. Now, if I want to check for equality, for mathematical equality, I could say 32 is equal to 32, with two equal symbols. That is True. If I want to check for inequality, if I say for example 32 is not equal to 42, that is also True. So not equal is with the exclamation mark and equal symbol.

Now, we can combine these. For example, I can say 11 modulo 2, and remember that gives me one. Well, I can say this is equal to zero, and that's going to be False. So here I'm getting an expression, and then I'm getting the result of that expression and doing a Boolean comparison to a value. But if I wanted to, I can combine these. For example, in parentheses I could say 4 is greater than 2, and, in parentheses, 50 is smaller or equal to 10, and so this is False because one of them is False. If I wanted to use or, I can do the same thing. Actually, I can copy this and change that to an or, and that is True because one of these two is correct. So 4 is greater than 2 and therefore this is True; however, 50 is not smaller or equal to 10, so this top one is False.

OK, now assignment. Here I've got something already written. I'm saying x is equal to 10, and remember that mathematical equality is two equal symbols, not a single one. So if I say x equals 10, first of all, if I run this I get no output. Hopefully you'll have noticed that for every cell so far we had an In and an Out, In and Out. With assignment, people often say assignment is silent in Python, and that means that if there are no errors during assignment you get no output. However, now if I look at x, here I've got an output: x has, or x holds the value, or points to the value, 10.
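As a recap, the comparisons, the and/or combinations and the assignment described above might look like this when typed into cells. This is a minimal sketch: the names both and either are introduced here for illustration and do not appear in the tutorial, and print is used so the results show up outside a notebook too.

```python
# Comparison operators return Booleans: True or False.
print(23 > 4)        # True
print(543 < 10)      # False
print(200 >= 199)    # True
print(200 <= 199)    # False
print(32 == 32)      # True: equality uses two equals signs
print(32 != 42)      # True: "!=" means "not equal"
print(11 % 2 == 0)   # False, because 11 % 2 is 1

# Combine comparisons with "and" / "or" (names are illustrative).
both = (4 > 2) and (50 <= 10)    # False: the second comparison is False
either = (4 > 2) or (50 <= 10)   # True: the first comparison is True

# Assignment uses a single equals sign and produces no output by itself.
x = 10
print(both, either, x)
```

In a notebook, typing x alone in a cell after the assignment would display 10 as the cell's output.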
So hopefully everything so far, even though you may not have written Python, is not too surprising, and hopefully you actually know a lot of this already from mathematics at school. Now we're going to step sideways a little bit and look at words and collections in Python. There are different ways to hold data in Python. We've seen integers, floats and bools already, but here we're going to look at strings.

So I have word, and this is just a name, it could be anything you want, and I assign a value to it, and I'm going to say the word is python. You notice that word, as a name, has no quotes, so that's like a variable in a mathematical equation. You say, you know, let a hold the value of 42 or whatever, so that would be my a. And then python, in quotes, that's the actual value being held. If I run this cell, now I'm asking for the type of word, and the type is string, str. Now if I look at word, I get back python, so indeed word is holding the value python.

Here I'm using single quotes. Let me just illustrate: if I was to use double quotes, this is actually the same thing. If I look at double quotes, you see that Python shows me my object again with single quotes. So it doesn't matter which you use, and you'll see both used. Simply, they have to be consistent, so you can't mix them. However, the reason we have both is that it allows you to put different types of things inside of text. You might have, for example, something like this, where you say hello y'all, and this only works because I've got double quotes on the outside. Otherwise, let me show you, if I had single quotes on the outside I would get an error, and immediately you see that the syntax highlighting is showing me this in red and then this in green. If I run this we get a syntax error, because Python is reading this string, and when it gets to the end of it, it stops, and it doesn't understand what you're
doing here. So to sidestep this, you can use double quotes and single quotes. Errors in Python are generally very helpful. This is one that's very short; they can be much longer. But if you get an error in Python, always look at the very last line. It'll tell you the type of error that you're dealing with, and sometimes it'll actually show you where the error occurred, so you can go back and look in your code and hopefully fix it.

OK, now, strings have got different attributes. One of them is that you can pass them to the function len, and it will tell you how many characters are in that string. You also have string methods, so again I'm adding a few cells down here. For example, I might say word.upper, and here I'm using a new syntax. I'm saying word, that's my object, and then dot, and then a function, and I've got parentheses after that function because I want to execute that function. The jargon for that is to call the function. So if I call the function upper on the object word, I get back python in uppercase.

In the Jupyter Notebook, if you put your mouse or your cursor inside of the parentheses and hit Shift+Tab, that's Shift and Tab, you get back help on the function or the object, and here it tells us that this is returning a copy of the string converted to uppercase. So that's great, we're getting back python in uppercase. An important word that's easy to skip is that this is a copy. If I look at word now, it has not changed. word.upper gave me a copy of python in uppercase; word is still in lowercase. That's really important, because in Python strings are what's called immutable. You cannot change a string in place. You can overwrite it, you can replace it, but any method that you call on a string will return a copy to you.

Now, there are plenty of other methods available. If I don't do Shift+Tab, but instead type my object, hit dot and then hit Tab, I get this little pop-up
menu that shows me all the methods and attributes available to me so for example i might check is this a digit and if i don't know how this works then i do shift tab and this returns true if the string is a digit string false otherwise you notice that you need parentheses to run it to call it and so this is not a digit right these are letters so that gives me back a boolean okay so let me go to another type in python which is a list and here i've got a new name word list and instead of a single value like python i'm now passing multiple strings comma separated inside of square brackets now if i run this again there's no output right now if i do what i did every time so if i check the type of word list here i see it's a list so that's a python type just like string float bool all of the others that we've seen and as before we can check the length of word list so we have five values in there python geology programming code and outcrop i can look at word list and get back the object something you can do here that you can actually do with strings as well is check for membership so for example i could say well tell me if geology is in word list and the keyword in here checks for membership and this returns to me a boolean value so yes geology is in the word list however i just want to draw your attention to the fact that geology with a capital g is not in word list because python is case sensitive now just like with strings we have methods available to us so as before i can do dot and hit tab and look at the methods so the first one we see there append is very common and if i check its documentation we append an object to the end of the list and so i might add i don't know mineral if i run that i don't get any output so it's a bit like an assignment but if i now look at word list indeed i've added one value to the end of it okay now if we've created these lists and strings very often we're going to want to reach into them and grab one or
several items out of them so how do we do that well we do that with what's called indexing and slicing indexing is if you want a single value and slicing is if you want multiple values now there's a couple of things to remember when we're dealing with indexing and slicing in python and this applies not just to lists or strings it applies to everything really we start counting at zero and for indexing and slicing we're going to be using square brackets so let's start with string so remember we had word it was python and remember that that had a length of 6. so how do we index into that well as i mentioned we'll use square brackets if i want the first value so p for example i can pass 0 in my square brackets and i get back p now if i want the n i can say give me the index five now notice the index for n is five so it's one less than the length of the word because we start counting at zero so if you were to reach in at index six you'll get an index error these are very common and in general it just means that you're off by one you're reaching outside of the list or the object so that's how we can get a single value out of a list or in this case a string it's the same with a list as we'll see shortly if we want to grab multiple values out of an object we use two numbers and this will give us a slice so for example i can say go from zero and then i use colon and go up to three and this gives me the first three characters p y t however if you just look at what we're counting so we're going zero one two so we're not actually grabbing the value at index three so this interval is a half open interval right so we include zero we exclude 3.
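The indexing and slicing rules described above can be sketched in a few lines. This is a minimal illustration using the same word, python; the variable names other than word are only there for the example.

```python
# Indexing and slicing a string, following the rules described above.
word = "python"

first = word[0]          # counting starts at zero, so index 0 is 'p'
last = word[5]           # the last index is len(word) - 1, so index 5 is 'n'
first_three = word[0:3]  # half-open slice: includes index 0, excludes index 3

# Reaching one past the end raises an IndexError.
try:
    word[6]
except IndexError as err:
    error_message = str(err)

print(first, last, first_three)
print(len(word), error_message)
```

The try and except lines are not part of the session; they are only used here so the cell keeps running after the error.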
that's something that you need to be aware of and you'll see this quite often all the time really in python so i'll keep talking about indexing and slicing but now we'll use a list instead it's the same principles but we're just changing the object that we're targeting so if i go back to my word list i've got all of this no surprises there i can do the same thing as i did just a second ago grab the first one if i wanted to grab the last one and i don't know how many there are i told you that you get the value that's one less than the length of the object so you might do something like this where you say len of word list minus one and that indeed gives us mineral but this is way too long and so what people writing python do is you just say word list minus one and what this does is it will start counting at the back of the list so you start counting zero one two three four et cetera and you go from the back minus one minus 2 minus 3 minus 4 etc now if you wanted to grab an interval that includes the last value because i told you that the last value that you put into an interval is not used and what you need to do is you need to omit it so how does that look well if i wanted to go from index 4 to the end i say 4 colon and that gives me outcrop and mineral so 0 1 2 3 4 and the last one and you need the colon because you if you don't have the colon and you just say index 4 you get a single value you don't get a slice okay so now we're looking at another type in python a dictionary so this is a container we can think of it as a container and so here i'm giving it the name container and i'm creating it you see the syntax is a little more complicated than what we've seen so far we've got curly brackets and then essentially we have key value pairs so the key is in this case a string the key can be anything that's immutable so we've mentioned strings are immutable so they're good candidate and the value on the other side of the colon is anything you want so i'm using word 
list word and a value here of 2022. so if i execute this cell i get back a dictionary and as before if i do type of container i get back that this is a dict in python so no big surprises there now we can look at the methods available to us here if i do dot tab so i'll start with keys for example okay we see that we get back words word and transform so these are the keys words word and transform that are holding the values in the dictionary now i talk about values so let's look at those the values are this first list that contains everything in words the second value is python and the third value is 2022. now i'm not going to go into details on dictionaries suffice to say that we don't use square brackets to reach into them well we can but a method that is very common is to use for example the get method and you can say get the value inside of this container whose key is word and that value is python right because word was python so remember the word is python so because i put word in there i get back python okay now dictionaries are a massive topic on their own i won't go through all the details here but you'll see them in no doubt all of the tutorials so it's important that you at least understand that they're containers of data and you don't access data in them in the same way that you access data in strings or lists now so far we've just used pure python so i didn't have to add any additional packages or libraries or modules it's just been pure python now how do we add more functions or more functionality so sometimes you need to reach for something that's not loaded by default and there are three main ways to do that you can import the whole package you can import the whole package with an alias and then you'll access things with name of alias dot function or you can say from package import function 1 function 2 and then you just have those names so here what we're going to do is we're going to illustrate that by importing from date
time this function called date time and it's not the best name for them but i want to illustrate what we can do with it so what we're going to do is add today's date to the dictionary so to do that i can go to date time and i can say date time now and if i execute this i get back on my particular computer at my locale the exact time right now now what i can say is just give me the date i don't want the time i just want the date so now i've got 25th of april 2022 and perhaps i want to change that format into something that's more standard so i can say dot isoformat and then i get back a string remember that anything in quotes in python is a string so i get back a string with today's date so if i wanted to update my container remember i've got this container here and say container dot update in there i can pass a dictionary where i might say for example location is global because transform is global and i'm going to say date is and then here i can say all of this now if i run that and then i look at the container you see that we've added location global and we've added a date today's date and if i reach in now i told you about the get method you can use square brackets to get something out of a dictionary as long as you know the name of the key and if you do that you'll get back today's date now do beware that if you have a typo in your key you'll get a key error okay so keying into a dictionary let's have a look at that quickly if we do container dot keys to see what keys are available to us so just as a recap i can do container dot get location for example or container square brackets date but do be aware that if you've got a typo in the date you get an error if you've got a typo in the get method so if i do dot get a fb i get no errors so i don't get this key error that i got above i just don't get anything back now if i want i could actually pass a value here and say i don't know this key does not exist and then i would get back that
string so again square brackets are absolutely fine if you're 100 percent sure you've got the correct key if you're not sure use the get method as it's safer okay so now let's change gears and look at controlling the flow in our python code so we often want to make decisions based on code and for that we need boolean values we've seen these before things like you know x is greater than five or i don't know word is equal to python or something like global is in container dot values so all these things are giving back a boolean and it doesn't matter what kind of thing you're using as long as you get back a boolean and now i can use this in a basic if statement so i might say something like for example if word is equal to python colon and then enter and this is where for the first time we're seeing the importance of indentation and white space in python so all the white space that i was using with operators is really just to make it more legible but this white space here which is one two three four spaces you need that for this to work so in an if statement i say if and then i have a condition that returns a boolean followed by a colon and on the next line indented by four spaces i'll have whatever action i want to take so i might say print code and so here for example because word is equal to python this statement does get executed so let's look at container dot values again and we've seen that we can check membership so this is just repetition of something that we've seen earlier so now we can use this in our conditional statement so i can say if python in container values print code otherwise print no code so if porosity is greater or equal to 0.3 then i'll say reservoir is equal to good then i can have elif which is going to be else if let's say 0.15 is smaller than porosity and that is smaller than 0.3 so you see how we can chain these conditionals here i could say reservoir is medium for example and finally i can have an else where i say reservoir
is poor if i now look at reservoir given that i've chosen a porosity value of 0.23 this returns medium as i'd expect now this construct we'll see all over and it's not always an if elif else you could have multiple if statements without an else and in that case every if statement that returns true will be executed it really depends on your particular use case so now what i want to look at with all of these things that we've seen so far is looping in python so repeating ourselves and one of the main advantages of computers is the ability to repeat these tedious tasks where we would make mistakes for sure so first i'm going to load some data and for that i'm using numpy and as i pointed out earlier there are standard ways to import these things so you should always import numpy with an alias which is np when i run this there's no errors there's no output that's because numpy is inside of my environment in my environment.yaml remember we had numpy so that means i can do this and then i'm grabbing some data from the data folder inside of this repository and i'm going to do a couple of things first i'm going to print the shape of those data so how many rows and columns we might have those kinds of things and then i'll say nphi is just going to be all the rows and this column one and print the shape of that but you see that we have four columns gamma ray nphi rhob and dt and 71 rows and when i grab nphi i'm grabbing all the rows that's what this colon is and we've not seen this before but when we're in numpy and we have multiple dimensions you do indexing and slicing exactly like you do in strings and lists however you have a comma between each dimension now we don't need to put all the other dimensions because we're not doing anything with them so now if i look at nphi we have 71 values of porosity in this case so what can we do with this well let's look at a loop how do we loop over this well let's say for
well first of all let me show you that poro does not exist so poro i get a name error because name poro is not defined now what i can do i can say for poro in nphi for example and i'll only use the first five nphi values i don't want to use all 71 of them then i can say print poro and so here you see one line after the other we're printing each value in nphi now it's worth just pointing out that up here we've only got four decimal places whereas down here we've got the full precision that's simply how numpy is displaying data the data under the hood are exactly the same they're just being displayed slightly differently so now let me delete these cells let's look at a few lines of code and i'll run them and then we'll talk about what we're looking at here so we're importing a new library matplotlib which i mentioned this allows us to make simple plots quite quickly and quite easily i will say that matplotlib can become quite verbose but you know it's really good for quick data exploration so the first thing we're doing here is we're creating a plot with nphi that's giving me this blue line then we're adding these two horizontal lines two red lines at 24 and 22 percent and we're adding a couple of labels and we're adding a grid in the background with an opacity of 40 percent and so that gives me this nice little plot here that's relatively easy to make so there's a few more things i want to show you before we go down here so remember we have i'm going to copy this over because i don't want to retype all of this we had this porosity with an if elif else statement well what i can do now is i can combine this with my loop so i can say for poro in nphi i'll just do the first five again and i can grab this code and paste it in here i just need to make sure that everything is indented correctly and then at the end i can add a statement where i say print reservoir and then we find that all of these are
medium value or medium porosity or medium reservoir quality in this particular case now what we could do is we could start counting these so i might do something like good reservoir count is equal to zero for example and then for poro in nphi let's grab the first 10 here and by the way i'm just using the first five the first 10 while i'm prototyping and then when i'm happy with my code i can run it on everything so for poro in the first 10 values of nphi i can say if poro is greater or equal to 0.24 for example say good reservoir plus equal one so what does this plus equal mean another way to write that would be to say good reservoir is equal to good reservoir plus one but this is too long to write the hash here is a comment so that means that this line of code line six will not be executed so instead of running this i can simplify that and instead of having item equals item plus one you can do item plus equal one and you get the same result and at the end if i look at good reservoir count we see that in these 10 first values we've got seven that are good with this particular cut off okay now if i run this same code with everything so without filtering out i can see that in fact i have 20 rocks of good quality this process of building up your code on a small sample is something that i'd recommend so that you're not running your tests or your prototyping on very large data sets straight out of the gate now i will just show you that you can do this with one line of code essentially and i'm showing you this really just in case you see it somewhere else so there's a structure called the list comprehension and so what you can do is i could say for example one for poro in nphi if poro is greater than 0.24 and then i could sum that and if i look at good reservoir lc you see we get the same results as with this loop that's because we're doing the same thing here on this one line now i will say this is something that beginners really should not
be concerned about however if you see this kind of thing you should know that this is nothing more than the loop so there's one for poro if you see that that's a loop and then everything else that's going on there you can basically do like this but you'll see both of these so i think it's useful to at least have seen it once okay so i'm watching the clock here as well so i'm going to go back into matplotlib so the python visualization landscape there's a link over here to his repo and the python visualization landscape as he illustrates here is very broad we're only talking about matplotlib over here if you want more interactivity there are some little tricks that you can do i'll show you something in the notebook in a few minutes but if you want really fancy powerful interactivity you're going to have to go towards d3.js javascript and so on and so forth but you know for most data science and data exploration you'll be more than happy with matplotlib and some of the associated libraries here so it's worth noting that as i showed you in a few cells above you can write your own code here your own plotting code but a really good place to go is to go to the matplotlib library to their examples and you can access those directly from the jupyter notebook if you go to help and go to the matplotlib reference okay looks like they have to update that there's a broken link in the reference there but if you go to matplotlib.org and then go to examples you see that there's a large collection of pre-written plots that you can choose from and i don't know if you wanted something like a scatter plot with histograms if you click on there you then have basically the code that you need to create that plot so let me illustrate that if i was to copy this to clipboard come back here and then make a new cell and paste that in there oh hold on i need this code
as well i need more than just the function and there you go now i have the same plot as i did in the documentation and without knowing anything about python you can still come back and have a read of this and really what you want to change is you want to change x y to be your data you don't need to worry about all of this if you're just wanting to plot x versus y with this kind of output so in general i'd recommend starting from the gallery and modifying your code to match what you're aiming for so let's go back to our simple example so again i'm just reiterating these imports so that you see them and you absolutely don't need to do them more than once but i think it's useful to just see it over and over again and we've seen that we can do simple plots like this plt.plot of nphi for example oh nphi is not a string sorry the data nphi not nphi the string now you can start to embellish this plot a little bit if you say plt.plot of nphi and then we pass additional arguments so for example i can say i want o dash i'll show you what that is in a second i can say c is equal to g like this and now you see that the o dash is giving us these circles at the data points c is the color of the line g stands for green no surprises hopefully now all of this is fine but it's what's called declarative plotting so when we do plt dot plot and here i could say plt dot grid alpha equals 0.4 plt dot ylabel plt dot xlabel and so on and so forth there's another way to use matplotlib which is much more robust and you'll see in the documentation and that follows this pattern where we'll say fig comma ax equals plt dot subplots for example this will set up a figure and an axes in matplotlib so the figure is basically the canvas that we're drawing on and each ax is going to be one plot so when you see plots with for example in the installation markdown i showed you logs and a map and a cross section that's one figure with three axes in it
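The figure and axes pattern described here can be sketched as follows. This is only an illustration under stated assumptions: the porosity list is made-up data standing in for the nphi column, and the Agg backend line is there so the sketch also runs outside a notebook (it is not needed in jupyter).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

# Hypothetical porosity values standing in for the nphi column from the session.
porosity = [0.21, 0.23, 0.26, 0.24, 0.19, 0.22]

fig, ax = plt.subplots()        # one figure (the canvas) holding one axes (one plot)
ax.plot(porosity, "o-", c="g")  # circles at the data points joined by a green line
ax.set_xlabel("sample number")
ax.set_ylabel("porosity")
ax.grid(alpha=0.4)              # alpha sets the opacity of the grid

# A single figure can also hold several axes, for example three plots side by side.
fig3, axs = plt.subplots(ncols=3)
print(len(fig.axes), len(fig3.axes))
```

With this pattern every call is made on a named object, fig or ax, so it stays clear which plot is being changed when a figure holds more than one.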
now what we can do is we can say the same kind of things we did with plt.plot i can say ax dot plot of nphi with the same kind of thing here i might say the line width i want that to be four i might want the marker size to be two so if i run this let's see what i have oh not four it's point four that's looking a bit too thick there we go line width of 0.4 marker size of 2 so it's just slightly thinner than what we had here you might want to do ax dot set ylabel and now we do need quotes because this is a string ax dot set xlabel and this is just the sample number for example ax dot set title and this might be a simple porosity plot and again i might set a grid so ax dot grid with an alpha again the alpha is the opacity of the grid and so now with just a few lines of code i've got a plot that's slightly nicer and of course i could save this to disk if i wanted to which i'll show you in the next cell so now we've created the plot i want to put everything in the same cell so you see that with fewer than 20 lines of code if i run this i import numpy i import matplotlib i load some data i reach into that data and get only the nphi but if you wanted you could do rhob dt or whatever i create a plot with all the labels and axes that we've seen so far and finally i save a figure so if i go back over here to images see that a second ago i saved nphi plot png which is our little png plot here which you could then include in an email a document anywhere you wanted so hopefully that illustrates what we can do with not that much code okay so now i'm going to show you another cell where we've got some interactivity so well let me just stop here before i show you all the code so we can add some interactivity using ipywidgets and the documentation there is pretty complete very good actually and it'll show you how you can add these widgets so sliders of different types range sliders progress bars text boxes and so on and so forth and so
what i've got down here i know it's quite a lot of code so let me run that first i'll show you what it does and then i'll explain the code so we've got the same plot as we had before but now we're coloring things based on which window they fall in and we've got these little sliders here where we can say well actually i want to extend the range of what's medium so i now only have one high value perhaps that's too high perhaps i want to be down here and likewise i can change my lower bound and you notice that the markers are updating the markers are changing depending on which family they fall in the colors are changing all kinds of things are changing so how do we do this well you see there's not that much more new code relative to what you've seen so far so these two lines are additional we need them to add the widgets that we've got here this is brand new you've not seen this before and in fact this structure is new as well we'll talk about that shortly but essentially what we're doing here is we're saying we want to have these different float sliders available to us and then we pass the values given by the user into our plotting code now don't worry too much for now about what's going on here but the plotting code is essentially the same as we've seen so fig ax nphi here there's some boolean values this is a little complicated and i'm not going to explain this now but essentially what we're saying is that if the value that we've got is greater than high then we're going to pass this particular up arrow we'll change the color to red and so on and so forth so that's what those things are the fill between that's what's giving us this green fill down here and then everything else is stuff we've seen already the only new thing is ax.legend which gives us this legend over here okay the legend we're getting because we've got labels so that's what's being passed in so we've got label high and label low and so that's why we see these here now i
appreciate this is way too much code for someone who's never coded in python but what i want to illustrate is that with you know less than 30 lines of code you can have a nice little interactive plot where you can test things and visualize things now as i said if you want more complex visualization tools in python you're going to need to go towards javascript okay so the beauty of code is you can reuse it now we've used functions already we've just seen one in fact just above let's have a look first of all print so the type of print is a built-in function or method we saw sum once and to sum you can actually pass what's called an iterable so you can pass a collection or a sequence and it'll return basically the sum of it so we've seen that we've seen plt.scatter so let me show you for example if i've got a bunch of x's one two three four five six seven eight nine ten if i'd say x squared is going to be well i'll use a list comprehension here remember this is just a loop if i say x squared for x in x's then maybe i'll look at x squared here so you see we've got the square values of x's well then i can do plt dot scatter this is also a function and say give me x's versus x squared in red for example so this is also a function that we're using and we're passing different values to it now we can write our own simple functions so for example i can say i want to create my adder that's going to add two numbers to do that we use the keyword def that you've seen once before in that interactive plot and then i can say my adder this is the name of the function and it's important to know two or three things by convention it starts with a lowercase letter and there are no spaces in the name so that's why we've got an underscore there then after the name of the function we'll have parentheses and inside the parentheses we have the arguments or the
parameters of the function so here i might just have a and b i have a colon and in python every time you have a colon you have an indented block so that looks like this and then i'll have a doc string and i'll show you why we need that so a doc string is going to be a string enclosed in three single or double quotes that simply allows me to have multi-line strings i can say add two numbers and then i can say args i don't know something like this i'll show you why we're writing that and then i can say return a plus b now if i execute this cell i'm defining that's the def keyword so i'm defining this function so now i can do the shift tab trick that you saw before or you can pass a question mark and that immediately gives you a little pop-up window at the bottom of the screen and you see that we've got the little documentation that we wrote here so that's something that you should always do when you're writing your own functions to make sure that you've documented what it takes and how it works okay so now we can actually use it so i can say my adder and add two values and you get back the result that you wanted so that's the basic syntax now i just want to show you how we'd write an actual function in python so you may remember this equation from the start the gardner relation so if you wanted to write this we'd say gardner and here we actually want vp however we want alpha and beta to be parameters that the user can change so i'm going to jump ahead a little bit here but what we can do is we can say alpha is equal to 310 and beta is equal to 0.25 colon and then i need a doc string so i'm going to grab that from another screen so you don't watch me typing all this stuff so here we're evaluating gardner's relation the arguments we have are vp which is a float p wave velocity alpha and beta are float-like so they can be ints or floats alpha is the scalar and beta is the exponent and we return rho as a float which is the
density and i've also added a source over here to the subsurface wiki which i recommend you check out and so then what we can do is we can say return alpha times vp to the power of beta if i now execute if i call this cell um i get back a value that you know i would expect so writing functions is something that you should always do in python once you prototyped and hopefully what i've shown you in this little quick tour of python is how you can take the syntax of python take basically build it up into conditional testing into loops so to reiterate or to repeat code you can make plots you can write functions that then you can you know add interactivity to if you're plotting and you can reuse those functions um in all kinds of places in python so coming to a close in this little intro i just want to point you to a few places where you might want to go next um useful and approachable documentation there are three main places i'd suggest you go um well for i suppose python.org there's a great tutorial section there the matplotlib gallery which which we touched upon and w3schools python the fourth place of course is transform so you've got you're going to have this year a great series of tutorials but there's also previous years where great introduction and advanced tutorials were made and they're on youtube for you to use um if you're more into books i'd suggest these three books so crash course in python automate the boring stuff and learning python for cheat sheets which i know a lot of people love there's a bunch of links here which i'd recommend you you check out um pure python cheat sheets and then some more geoscience focused uh cheat sheets that i'd strongly recommend and then of course there's all the transform youtube tutorials so there was a preparing for transform 2020 with setup guides for windows and linux which we kind of covered today but perhaps there was more detail there um there's a playlist of learning python for geoscience available to you on 
youtube through the software underground and of course there's all the content from transform 2020 and transform 2021 now the last point there is probably the most important in my mind is your own awesome project so the best way to learn anything is through practice and so hopefully now you've got at least some illustration that with not much code you can get some results that you can then share with uh with others so with that i'm going to stop sharing and uh close out this um this little session here i'd like to thank everyone who's watching this live anyone who might be watching this in future um and for those of you who are watching live i'm going to go into the related slack channel to answer any questions you may have okay thank you all you
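As a recap of the pieces covered in this session, the sketch below puts the conditional, the loop with a counter, the list comprehension and the gardner function side by side. It is a minimal sketch and not the notebook's own code: the porosities list and the classify helper are hypothetical, the cut-offs and defaults are the ones mentioned in the session, and reading vp in m/s with density in kg/m³ is an assumption about units that the session does not state.

```python
def gardner(vp, alpha=310, beta=0.25):
    """Estimate density from P-wave velocity with Gardner's relation.

    Args:
        vp (float): P-wave velocity (assumed here to be in m/s).
        alpha (float): The scalar, default 310.
        beta (float): The exponent, default 0.25.

    Returns:
        float: The density, alpha * vp ** beta.
    """
    return alpha * vp ** beta


def classify(porosity, low=0.15, high=0.3):
    """Hypothetical helper wrapping the if / elif / else from the session."""
    if porosity >= high:
        return "good"
    elif low < porosity < high:
        return "medium"
    else:
        return "poor"


# Made-up porosity values standing in for the nphi data.
porosities = [0.23, 0.31, 0.12, 0.24, 0.28]
labels = [classify(poro) for poro in porosities]

# Counting with a loop and a counter.
good_count = 0
for poro in porosities:
    if poro >= 0.24:
        good_count += 1  # same as good_count = good_count + 1

# The same count written as a list comprehension passed to sum.
good_count_lc = sum([1 for poro in porosities if poro >= 0.24])

rho = gardner(2000)
print(labels, good_count, good_count_lc, round(rho))
```

Wrapping the conditional in a function is a design choice made for this sketch so it can be reused inside the loop; in the session the same if, elif and else block was pasted directly into the loop body.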
Info
Channel: Software Underground
Views: 2,406
Keywords: conference, python, tutorial, geoscience, subsurface
Id: wF9ZlPOCwIc
Length: 88min 58sec (5338 seconds)
Published: Mon Apr 25 2022