LangChain Crash Course For Beginners | LangChain Tutorial

Video Statistics and Information

Captions
LangChain is a framework that allows you to build applications on top of LLMs, or large language models. In this crash course video we are going to go over all the basics of LangChain and then build a restaurant idea generator application using Streamlit, where you can input any cuisine (Indian, Mexican, etc.) and it will generate a fancy restaurant name along with the menu items.

First, let us understand what LangChain is and what kind of problem it addresses. When you use ChatGPT as an application, internally it makes a call to the OpenAI API, which in turn uses an LLM such as GPT-3.5 or GPT-4. ChatGPT itself is not an LLM; it is an application, whereas GPT-3.5 and GPT-4 are large language models. Now let's say you want to build a restaurant idea generator application where you give it a cuisine and it generates a fancy name, such as "Taco Temptation" for Mexican, "Curry Palace" for Indian, or "Sahara Palace" for Arabic, along with menu items. This is the sample application we are going to build. It is an LLM-based application, and we could use the same architecture as ChatGPT: directly call the OpenAI API (I have shown a screenshot of their API here), and you get behavior similar to ChatGPT, because internally it uses the GPT-3.5 or GPT-4 model. Once again, the restaurant idea generator is an application similar to ChatGPT, but internally you are using the OpenAI API and LLMs.

There are a couple of limitations to this approach. By the way, the reason I am telling you this is that there is a big boom in the industry right now where every business wants to build its own LLM application. You might wonder why they can't just use ChatGPT: ChatGPT has no access to their internal organizational data, so people want to build their own LLM-based applications, and there is clear demand for this. So why don't businesses use this direct-API architecture? A few things to consider. First, calling the OpenAI API has a cost associated with it: for every thousand tokens they charge around $0.002 (check the OpenAI pricing page for current numbers), and if you are a startup with funding issues and a limited budget, this becomes a bottleneck. Second, you might have noticed that ChatGPT doesn't answer questions about recent events; its knowledge is limited to September 2021 as of this recording, so if you want to incorporate the latest information from Google, Wikipedia, or somewhere else, you can't get it this way. The third issue: AtliQ is my own software development and data science company. If I want to know how many employees joined last month, ChatGPT can't answer, because it doesn't have access to my internal organizational data. So if you use this kind of architecture for building your application, you will hit some roadblocks, or rather, some limitations. And look, the OpenAI folks are pretty smart; they could address all of this if they wanted, but their stance is clear: they provide the foundational APIs, and building frameworks is something other people should do. And that's what happened. The OpenAI API alone is not enough to build LLM applications; you need some kind of framework where you can call OpenAI's GPT-3 or GPT-4, or, if you want to save cost, call open-source models instead.
There are so many models out there, such as BLOOM on Hugging Face. Let's say you want to use one of them because you don't want to spend money on OpenAI's GPT-3 model; the framework should provide plug-and-play support, where you can swap in any of these models and your code more or less remains the same. The framework should also provide integration with Google Search, Wikipedia, or even your own organizational databases, so that the application can pull information from all these sources as well. And LangChain is exactly that: a framework that allows you to build applications using LLMs.

OK, let's install LangChain now and do some initial setup. Let us first create an account on OpenAI: go to the OpenAI website, click on log in, and create a login using Google or your email credentials. Once your login is created you will land on the dashboard; click on API, then from your account go to Manage account and API keys, and you will find a key there that looks something like "sk-...". That key is like a password, and we need to use it in our code for LangChain. You can also create separate keys for separate projects; I have some client projects and YouTube tutorials going on, so I keep a separate key for each, but in your case one key is enough, and you can generate a new key here as well. Copy that key to some secure place, because after that you won't be able to see it here again; you would have to delete it and create a new one.

So let's say you have that key ready. You can import the os module and create an environment variable set to that particular "sk-..." value. In my case I have stored the key in a separate Python file, because I don't want to share it with all of you. That file looks something like secret_key.py; it holds my own key (you can keep any number of keys there), and I simply import that variable and set the environment variable from it. Ctrl+Enter, and that is set. Now let's go to the terminal and install some modules: the first one is pip install langchain, and the second one is pip install openai.

Once you have installed those modules, let's import a few important things from LangChain. We are going to import the LLM class called OpenAI. We are using OpenAI because, although it costs some money, it is the best one; if you want some other model, just hit Tab on the import and it will show you Hugging Face and whatever other LLM integrations are available. We are happy with OpenAI for now. Then I create my OpenAI model, which has a parameter called temperature. Temperature controls how creative you want the model to be: if it is set to 0, the model plays it safe and doesn't take any bets; if it is 1, it takes risks and might generate wrong output, but it is more creative. I tend to set it to something like 0.6 or 0.7. Now you can pass any prompt to that llm object, so let's say: "I want to open a restaurant for Indian food, suggest a fancy name for it." I'm not able to come up with a good restaurant name idea myself, so let's see what the model does. I also typed the same question into ChatGPT.
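Here is a minimal sketch of that setup, assuming the LangChain 0.0.x-era imports that were current when this video was recorded (newer versions have moved these classes). The key string and the secret_key variable name are placeholders, not real values from the video:

```python
import os
from langchain.llms import OpenAI

# Either set the key directly (placeholder shown), or import it from your own
# secret_key.py file, e.g.: from secret_key import openai_key  (variable name assumed)
os.environ["OPENAI_API_KEY"] = "sk-..."

llm = OpenAI(temperature=0.6)  # 0 = safe/deterministic, 1 = more creative but riskier

name = llm("I want to open a restaurant for Indian food. Suggest a fancy name for this.")
print(name)
```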
For Mexican food ChatGPT suggested one name; if you say Indian food it suggests something else, so we are using essentially the same concept here. When I run it, it gives something like "Maharaja's Palace"; if I say Italian food, the name it generates genuinely sounds like an Italian restaurant. So we have imported the OpenAI class, created an LLM, and we are just passing it a simple text prompt.

Now, I don't want to keep editing this same string, so I will create something called a prompt template. From langchain.prompts you can import PromptTemplate, and in that prompt template you pass input_variables — in our case the variable will be cuisine — and then the template itself, which is the same sentence with the hard-coded "Italian" replaced by the {cuisine} variable. I will call this template prompt_template_name, because it is for the restaurant name. Once the template is created, you can call prompt_template_name.format(cuisine="Mexican") and it produces "I want to open a restaurant for Mexican food..."; pass Italian and it says Italian food. This is basically Python string formatting, and you might wonder why we don't just use Python string formatting directly. Let me show you why, using something called a chain.

We are going to use this concept of chains, and it is one of the most important objects in LangChain; you can figure that out from the name of the framework itself. We import LLMChain, and an LLMChain is a very simple object where you say: my llm is this one (the one we created above) and my prompt is this prompt template, and that is my chain. Then you can call chain.run("American"), and it returns something like "The All-American Grille and Bar." Now I don't have to pass the whole "I want to open a restaurant for..." sentence; I just pass the cuisine variable and it works every time — try Mexican and you get another name. Internally it is calling the OpenAI API, and we made that connection via the openai module; if you were using Hugging Face, you would do the Hugging Face setup and it would call Hugging Face instead. We are paying for these calls, but by the way, when you created your OpenAI account you got $5 of free credit, which is more than enough for initial learning and exploration; after that, if you like it, you can go ahead and pay.
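A minimal sketch of the prompt template and chain just described, again assuming the LangChain imports from the time of the video; the exact prompt wording is paraphrased from the narration:

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.6)

prompt_template_name = PromptTemplate(
    input_variables=["cuisine"],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this.",
)
print(prompt_template_name.format(cuisine="Mexican"))  # plain string formatting

# The chain wires the template to the LLM, so you only pass the variable.
chain = LLMChain(llm=llm, prompt=prompt_template_name)
print(chain.run("American"))
```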
That is the simple chain. Now let's look at something called a sequential chain; let me explain the concept first. So far we had one chain that generates the restaurant name, and it does that fine, but for that restaurant you also want to generate menu items. So you can add a second component, a second chain, where you pass the restaurant name as input and it gives you the menu items you should include in that restaurant: if it is an Indian restaurant it might say paneer tikka, mango lassi, etc.; if it is Mexican it might say quesadilla, burrito, and so on. This arrangement is called a SimpleSequentialChain, and we will code it up, but just to clarify the idea: you have one input and one output, with intermediate steps, where the input of the second step is the output of the first step. It is as easy as that.

Here, once again, I'm generating everything from scratch. I created the name chain the same way we did before — an exact copy-paste of the previous code, nothing fancy. Then we create another chain (again copy-pasting, just to save time), where the input is restaurant_name and the prompt says "Suggest some food menu items for {restaurant_name}. Return them as a comma separated list." It is like the following two-step ChatGPT conversation: first you say "I want to open a restaurant for Indian food, suggest a fancy name, only one name please," ChatGPT generates, say, "Saffron Spice," and then you say "generate food menu items for Saffron Spice and return them as a comma separated list" — that list is exactly what we want. So now we have two chains; Ctrl+Enter to execute this code. Then from langchain.chains — it will show you all kinds of chains — you import SimpleSequentialChain, and that SimpleSequentialChain is built from the individual chains we created; by the way, the order matters here: first the restaurant name chain, then the food items chain. Then you say chain.run("Indian"), you get a response, and you print it. Sometimes it takes a few seconds, but it generates the menu items — you're probably getting hungry already. Let's do Mexican for all the Mexican food lovers.

While this chain looks good and generates the food menu items, I am not getting the restaurant name as such: for Mexican food it lists all these items, but what is the restaurant name? That was the intermediate step, and a SimpleSequentialChain gives you just the one final output. I want both the restaurant name and the menu items, and for that we have to use a different chain called SequentialChain. A SequentialChain can have multiple inputs and multiple outputs, so I could say "give me a name for an Indian restaurant which is vegan" and then ask for both the restaurant name and the items as outputs. Let's try something like that. The whole code mostly remains the same; I'm just adding one or two extra things. In my first chain the extra thing is output_key, so the output of the first chain is restaurant_name, and the second chain's output_key is menu_items. Now let's create the sequential chain: from langchain.chains import SequentialChain — earlier we used SimpleSequentialChain, and this one is a little more generic. What parameters does it take? First, chains: these are the two chains I have. Then input_variables: you can specify all of your input variables, but I'm not including vegan etc., let's keep things simple; all I want is cuisine. And then output_variables, where I say I want restaurant_name and menu_items as my outputs. Let's call this chain. And by the way, when you run this chain you can't just pass "Mexican", because you might have multiple input variables; you need to pass a dictionary, such as {"cuisine": "Arabic"}.
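Here is a sketch of that two-step, multi-output setup, assuming the same LangChain-era API; the prompt wording and variable names are illustrative reconstructions of what is described above:

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain

llm = OpenAI(temperature=0.7)

# Chain 1: cuisine -> restaurant_name
name_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate(
        input_variables=["cuisine"],
        template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this.",
    ),
    output_key="restaurant_name",
)

# Chain 2: restaurant_name -> menu_items
food_items_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate(
        input_variables=["restaurant_name"],
        template="Suggest some menu items for {restaurant_name}. Return it as a comma separated string.",
    ),
    output_key="menu_items",
)

# Unlike SimpleSequentialChain, SequentialChain exposes multiple named outputs.
chain = SequentialChain(
    chains=[name_chain, food_items_chain],
    input_variables=["cuisine"],
    output_variables=["restaurant_name", "menu_items"],
)
print(chain({"cuisine": "Arabic"}))  # called with a dict, not chain.run(...)
```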
When I tried chain.run it said run is not supported; for a chain with multiple outputs you don't use run, you just call the chain itself and give it the argument. Hit Enter, and there you go: hummus with pita bread, falafel, and the name of the restaurant is "The Arabian Bistro." In fact it returns the input as well, so you get the input and both outputs.

Now we will take whatever code we have written so far and create a Streamlit-based application for the restaurant name generator. I am in my code directory and I have created an empty folder called restaurant-name-generator; you can see there are no files in it. I'm going to launch PyCharm, which is a free Python code editor, select Open, and open that particular folder: go to the code directory, locate restaurant-name-generator, hit OK, and it creates an empty main.py file whose default content I can just remove. I'm going to import the Streamlit library first. If you don't know about Streamlit, it is a library that allows data scientists to build POC (proof of concept) applications, simple applications, very quickly; you don't have to use front-end frameworks such as React.js, this library lets you do all of that very fast. Let me show you: you can create a simple application with the title "Restaurant Name Generator." By the way, you have to run pip install streamlit before you start using it, otherwise you'll get an error, so make sure you have done that. So this is the simple app with one title; now go to the terminal, run streamlit run main.py, and it opens an application in your browser — a simple page with that particular title.

Now I can create the picker where you pick the cuisine, and for that I'll use the sidebar. Streamlit has something called a sidebar where you can create a select box and give that box a label, "Pick a cuisine," along with all the options you want in the dropdown; I'll just put a bunch of cuisines: Indian, American, Mexican, and so on. Hit Ctrl+S to save, click rerun in the browser (or just press the R key), and you get this nice picker. When someone picks an entry, say Mexican, that call returns the selected value, which we store in a variable called cuisine, and then we can say "if cuisine: do something." What do I want to do? Generate the fancy restaurant name and the list of menu items. For that, let me first write some dummy code: I will call a function, say get_restaurant_name_and_items, which takes the cuisine as input and returns a restaurant name and menu items. It is always a good idea to write this kind of stub (empty) function so that you can check your wiring first and write the actual code in that function later. So let's say my restaurant name is "Curry Delight" and my menu items are whatever items you like. If a cuisine is selected, get those things as a response, and then show the restaurant name on the page, somewhere below the title; I will use st.header as the UI control for that.
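A sketch of the main.py wiring described here and in the next part, using the hard-coded stub response so the UI can be checked before any LangChain code is plugged in; the function name and cuisine list are taken from the narration and are illustrative:

```python
# main.py
import streamlit as st

st.title("Restaurant Name Generator")

cuisine = st.sidebar.selectbox(
    "Pick a cuisine", ("Indian", "Italian", "Mexican", "Arabic", "American")
)

def get_restaurant_name_and_items(cuisine):
    # Dummy stub; later this is replaced by a call into the LangChain helper module.
    return {"restaurant_name": "Curry Delight", "menu_items": "samosa, paneer tikka, naan"}

if cuisine:
    response = get_restaurant_name_and_items(cuisine)
    st.header(response["restaurant_name"])
    st.write("**Menu Items**")
    for item in response["menu_items"].split(","):
        st.write("-", item)
```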
Let's use st.header for that. Next, the menu items: the response gives me a comma-separated string, so I can call the split function with a comma as the separator, which returns a list; then I iterate over that list and write each item out, maybe with a dash or some character in front to indicate it is an item. You can also add a small header above it saying "Menu Items." Hit save; this code is ready. Go back to your UI, rerun, and you get the restaurant name and the menu items. Of course, when you change the selection nothing changes yet, because we are returning the hard-coded response; the next step is to take the code we wrote in our Jupyter notebook and plug it in here. Since I like to modularize my code, I'm going to create a new Python file, call it langchain_helper, and move the function into that file; in main.py you import that module and simply call the function.

Now let's focus on langchain_helper. What do we need to do here? The same thing we did in our notebook, so I'm just going to copy-paste some code from my notebook; we don't need to go over it again because we have already written it. I will also create a file for my secret key, secret_key.py, and in that file I place my OpenAI key. I'm not going to show you my key, because of course it is private, but you would type whatever key you got (remember, you have $5 of free credit, which is more than enough). Put it there, and then import that variable from that Python file directly. So my key is ready; what else do I need to do? Again, copy-paste, folks — copy-paste is a boon for any programmer or data scientist. We copy-paste the sequential chain code we wrote in the notebook: we create the restaurant name chain, we create the menu items chain, and we return the chain's response. It is that straightforward. I also have a habit of adding an "if __name__ == '__main__'" block just so I can test the module, so I print the result of generating a restaurant name for, let's say, Italian. Now I'll pause the video and put my real secret key in place. Done; run it, and we generate an Italian restaurant name and menu items — perfect. The restaurant name is "La Dolce Vita" or something like that, and the menu items are margherita pizza, fettuccine alfredo, lasagna, and so on.

One thing I'm noticing is some extra "\n" characters in the output, so we need to remove them. The way to do that: go back to the Streamlit code and, instead of using the restaurant name directly, call the strip() function on it, which removes leading and trailing whitespace including those "\n" characters; use the same thing on the menu items string before calling split. Hit Ctrl+S to save, go back, rerun, and for Italian food you see the restaurant name and the menu items; change it to Mexican, change it to whatever, just play with it.
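Putting the pieces together, here is a sketch of the helper module described above: it is essentially the notebook's SequentialChain code moved into a file, following the copy-paste approach mentioned. The module, function, and secret_key variable names are assumptions based on the narration, not verified file contents:

```python
# langchain_helper.py
import os
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain
from secret_key import openai_key  # assumed variable name inside secret_key.py

os.environ["OPENAI_API_KEY"] = openai_key
llm = OpenAI(temperature=0.7)

def generate_restaurant_name_and_items(cuisine):
    name_prompt = PromptTemplate(
        input_variables=["cuisine"],
        template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this.",
    )
    name_chain = LLMChain(llm=llm, prompt=name_prompt, output_key="restaurant_name")

    items_prompt = PromptTemplate(
        input_variables=["restaurant_name"],
        template="Suggest some menu items for {restaurant_name}. Return it as a comma separated string.",
    )
    food_items_chain = LLMChain(llm=llm, prompt=items_prompt, output_key="menu_items")

    chain = SequentialChain(
        chains=[name_chain, food_items_chain],
        input_variables=["cuisine"],
        output_variables=["restaurant_name", "menu_items"],
    )
    return chain({"cuisine": cuisine})

if __name__ == "__main__":
    print(generate_restaurant_name_and_items("Italian"))
```

In main.py you would then import this module and call generate_restaurant_name_and_items(cuisine) in place of the stub, applying strip() to the returned strings as described above.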
Folks, the art of getting skillful at coding is to practice; you are not going to learn it just by watching, so make sure you are practicing while you watch this video. All right, our Streamlit application is ready. As the next step we are going to look into something called agents, which is a very powerful concept in LangChain. And by the way, all the code that we are writing is linked in the video description.

So, agents. What happens when you type this into ChatGPT: "Give me two flight options from New York to Delhi on a given date"? Obviously it won't be able to answer, because its knowledge only goes up to September 2021. But if you have a ChatGPT Plus subscription there is this thing called plugins, and I have installed the Expedia plugin; Expedia is a website that helps you find tickets. When I ask the same question with the plugin enabled, it magically starts working: it goes to the Expedia plugin, pulls the flight information for the given date, source, and destination, and then types out two options. Option one is around a $518 ticket, which is a pretty good deal by the way — I should probably book it for my next India trip — there is a second option, and it gives you a link where you can go and book those tickets on Expedia.

So what exactly happened when we enabled this plugin? When people think about an LLM, many think of it as just a knowledge engine: it has knowledge and tries to answer based on that knowledge, which is limited to September 2021. What we tend to miss is that it also has a reasoning component — it is a reasoning engine too. Think about how a human would handle this question: if a friend asked you, and you were the reasoning engine, you would go to Expedia, put New York as the source, Delhi as the destination, and August 1st as the date. You can do that because you have a reasoning engine in your brain. Similarly, the LLM uses its reasoning to figure out from the sentence what the source, destination, and date are, it calls the Expedia plugin, and the plugin returns the response.

Let's look at another question: "When was Elon Musk born? What is his age now in 2023?" Maybe this one can be answered from the LLM's own knowledge, but suppose you ask about an event that happened in 2022: the model has no knowledge after September 2021, yet it still has reasoning capability, so it can say, "to answer this, first I need to find out when Elon Musk was born," and for that it can use a tool such as Wikipedia. Agents essentially do exactly this: an agent has tools, and it uses those tools to fetch the answers it needs. Wikipedia says Elon Musk was born in 1971, and then another tool — a math tool — can compute 2023 minus 1971, so in the end it says Elon Musk is 52 years old. That is what agents are: they connect to external tools and use the LLM's reasoning capability to perform a given task. Let's look at one more question: how much was the US GDP in 2022, plus 5? It is a silly operation and nobody cares about the result, but it illustrates the idea.
The LLM doesn't know the US GDP for 2022 because its knowledge stops in 2021, so it will go to Google, find that answer, and then use the math tool to add 5. All of these tools — the Google Search tool, the math tool, the Wikipedia tool — are available as part of LangChain, and you can configure your agent with them: an agent is nothing but these tools plus the LLM's reasoning capability, used together to perform a given task. And this agent can be used right in our Jupyter notebook, which is what I'm going to show next.

First we import a couple of important modules and classes, and then I create the tools: I call load_tools and give it a list of tool names. If you do a Google search for "langchain agent load tools" you will land on the documentation page listing the available tools: Wikipedia, Twilio, and all the others you can use. We are going to use the Wikipedia tool, which is called "wikipedia", and the math tool, which is called "llm-math"; for that one you also need to provide the llm variable, the one we created above. This gives us our tools. Then you create an agent using the initialize_agent method, which takes the tools, the llm, and an agent type, and for the agent type I use zero-shot-react-description. ReAct here means reasoning and acting: when we reason, we first have a thought, then we figure out where to go and take an action, and this agent type mimics that concept. I call this object agent and then ask: agent.run("When was Elon Musk born? What is his age right now in 2023?"). Let's see what this gives us — perfect, it says he is 52 years old in 2023.

If you want to follow the reasoning step by step, pass verbose=True; by the way, verbose=True is a parameter you can use in many of these functions to see the internal steps being taken. The first step: when the agent encounters this question, it knows it has to go to Wikipedia to get Elon Musk's birth date, so it visits the Elon Musk Wikipedia page, which contains that date, and then it should use the math tool. In this particular run it didn't show that step, and rerunning didn't help, so I'm not sure why it isn't displaying right now; but let me show you a previous snapshot I have, where it says it went to Wikipedia for Elon Musk's birth date, then took the action "Calculator" — that is the llm-math tool — and calculated the final answer: he is 52 years old.
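A sketch of that agent setup under the same LangChain-era API; the wikipedia tool may additionally require installing the wikipedia Python package, and the question string is the one used in the narration:

```python
from langchain.llms import OpenAI
from langchain.agents import AgentType, initialize_agent, load_tools

llm = OpenAI(temperature=0)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # prints the thought/action steps of the ReAct loop
)
agent.run("When was Elon Musk born? What is his age right now in 2023?")
```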
Now let's try a different option: this time we are going to use SerpAPI. If you don't know about SerpAPI, it is a Google Search API: whatever you search on Google, and whatever results it gives you, you can access programmatically through this API. You can log in using your Gmail account — I am already logged in — and when you go to the dashboard it gives you an API key, similar to our OpenAI API key, a long string. I have stored that key in my private secret_key file, and for SerpAPI you need to initialize a specific environment variable with it. You could just copy-paste the key directly into your code if you want to keep things simple; I keep it in the separate file because I don't want to show it publicly. Once that is set, the next steps are very similar, so I can copy-paste pretty much everything and just initialize the agent with serpapi and llm-math as the two tools, then say agent.run("What was the US GDP in 2022 plus 5?"). While it is executing, let's Google it ourselves: searching "US GDP 2022" gives about 25.46 trillion dollars. The SerpAPI tool does that Google search for the agent, so first it searches for the US GDP figure, then it adds 5 to that number, and it gives you the result. We will talk more about agents in future videos; one thing I've noticed is that agents are not perfect — sometimes they give silly answers — but this whole area is evolving, so it will get better in the future.

Now we will talk about memory. When you look at any chatbot application such as ChatGPT, you will notice that it remembers the past conversation exchanges. Here I asked who won the first Cricket World Cup, then a totally unrelated question, "what is 5 plus 5," and then I asked who was the captain of the winning team. Notice that I did not say which match or even which sport — cricket, football, etc. — but it remembers that I was talking about cricket and gives me a relevant answer. The same thing happens in human conversation: we start a topic, keep talking, and remember what the topic is about. If you look at an LLMChain, by default these chains do not have memory; they are stateless. Among the chain's attributes you will find one called memory, and if you try to print it you get nothing, because it is set to None. If you want, you can attach memory to it: you create an additional memory object and attach it so that the chain remembers all of these conversations. This is useful especially if you are building a chatbot, say for your customer care department, where transcripts of conversations often need to be saved for legal and compliance reasons. So here I import an object called ConversationBufferMemory, which is a very common type of memory in the LangChain module, and create an object of that class; then I build the same chain as before but pass memory as an additional argument, and I run the chain one more time with a different question. Now when you look at chain.memory, you see there is a memory attached — we explicitly attached this conversation memory — and if you print its buffer (nicely formatted), you see Human, then AI, then Human and AI again. This looks good; you could save it into your database as a transcript of your customer service conversation. But one problem with this particular object, ConversationBufferMemory, is that it keeps growing endlessly. Let's say you have 100 conversational exchanges — and by an exchange I mean one question-answer pair, so two question-answer pairs are two exchanges. With 100 exchanges in the buffer, something important happens the next time you ask a question.
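A minimal sketch of attaching that memory to the chain, assuming the same LangChain-era classes; the prompt is the restaurant-name template used earlier, and the two example runs are illustrative:

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory

llm = OpenAI(temperature=0.7)
prompt = PromptTemplate(
    input_variables=["cuisine"],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this.",
)

memory = ConversationBufferMemory()
chain = LLMChain(llm=llm, prompt=prompt, memory=memory)  # by default chain.memory is None

chain.run("Mexican")
chain.run("Arabic")
print(chain.memory.buffer)  # the accumulated Human / AI transcript so far
```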
The next time you ask a question — when you call chain.run — it is going to send all of that past history to OpenAI, and OpenAI charges you per token: this is one token, that's the second, the third, the fourth, and so on, and for a thousand tokens they charge around $0.002 depending on the model. So your cost is going to go up. If you want to keep the cost down and do things in an optimized way, you need to restrict this buffer size, for example by remembering only the last five conversational exchanges.

For this kind of conversation, LangChain provides a ConversationChain, which is a very simple object; let me create one, passing llm = OpenAI with temperature 0.7, and call it convo. Let's check the default prompt associated with it; if I print the template, it says that the following is a friendly conversation between a human and an AI, and it has a {history} placeholder and an {input} placeholder. If you picture the chat window, everything said so far is the history, and the next question you type is the input; every turn follows the same history-plus-input pattern. Now I ask it a bunch of questions (copy-pasting them here): who won the first cricket world cup, what is 5 plus 5, and who was the captain of the winning team. When you inspect convo.memory now, it is not empty, because a ConversationChain comes with a built-in ConversationBufferMemory by default, and if you print the buffer you see the entire transcript of our conversation. While this looks good, once again think about the OpenAI token cost: if the buffer keeps building endlessly, you might end up with 5,000 tokens in one conversation, and every new call sends that entire history to OpenAI, which increases your bill for the API calls. To tackle that problem, you can send only the last 10 or 20 conversational exchanges, because that might be enough for the use case you are dealing with. For that there is an object in langchain.memory called ConversationBufferWindowMemory: it restricts the window, and k is the parameter, so k=1 means remember only the last one conversational exchange, that is, one question-answer pair. Let's try it out: I create the same ConversationChain object but with this window memory, ask my first question, then my second. When I ask the second question, "what is 5 plus 5," it still remembers the previous exchange, but when I ask the third question it only remembers "what is 5 plus 5"; it has forgotten the World Cup question. It's like short-term memory loss, like in the movie Memento — it just forgot what happened earlier. So for the third question it says "I'm sorry, I don't know," because it no longer knows which game or which particular match we were talking about. I know this is probably not the best example, but I wanted to demonstrate the k parameter, and depending on your use case you might see real benefit in using ConversationBufferWindowMemory.
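A sketch of the conversation chain with a size-limited window memory, as described above, using the same assumed LangChain-era imports; with k=1 only the most recent exchange is retained, which caps the history sent to the API on each call:

```python
from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory

llm = OpenAI(temperature=0.7)

convo = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=1),  # keep only the last question-answer pair
)
print(convo.prompt.template)  # the default "friendly conversation" prompt with {history} and {input}

convo.run("Who won the first cricket world cup?")
convo.run("What is 5 + 5?")
# Only the 5+5 exchange is still in memory, so the model no longer knows
# which game or team we were talking about.
print(convo.run("Who was the captain of the winning team?"))
```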
That's all we had for this video. In the future we are going to build end-to-end LLM applications using some of the more advanced features, such as the RetrievalQA chain, the FAISS vector database, vector stores, and things like that. If you want to get notified about upcoming LLM and LangChain videos, subscribe to the channel and click the like button. If you really found some benefit from this video: making these kinds of videos takes a lot of effort, and clicking that thumbs-up button, subscribing, or sharing the video may be a small thing for you, but it means a big thing for us; it helps us get this video to more people so that they can also benefit, and we get some appreciation and reward for the hard work we are putting in. If you have any questions, post them in the comment box below. The code link is given in the video description. Thank you for watching, bye bye.
Info
Channel: codebasics
Views: 60,141
Keywords: yt:cc=on, LangChain, LangChain Crash Course, ArtificialIntelligence, Artificial Intelligence, Deep Learning, Natural Language Processing, LangChain Tutorial, LangChain Simple Explaination, LangChain Easy Explanation, Large Language Model, ChatGPT, langchain tutorial, langchain explained, langchain agent, langchain demo, langchain 101, langchain tutorial python
Id: nAmC7SoVLd8
Length: 46min 6sec (2766 seconds)
Published: Fri Jun 30 2023