Why Python is huge in finance? by Daniel Roos

Video Statistics and Information

Captions
Okay, hi everyone. My name is Daniel Roos. I am one of the cofounders of Njorda, a small fintech, but before that I worked in the finance industry, at banks and system vendors, for about 20 years, mainly using Python. That's the reason for this talk. I've seen Python used for a lot of interesting things: really good opportunities for jobs, really good opportunities to sell products to big companies with a lot of cash. A lot of this is not super well known, because the finance industry can be a bit complex and it can also be a bit secretive.

So the goal today is to do a deep dive into this vertical, leaving machine learning behind a bit and instead focusing on finance and how Python is used there. Hopefully you'll learn something interesting and maybe come away with something that's useful to you personally. At the end there's a demo, if there's time, of some kind of trading strategy.

First of all, finance: what is it? Very briefly, you can think of it as three things. Banking: everything with money, loans, making payments. Asset management: everything about making money grow. But this is also like a routing function. It's a kind of magic: you have some money left over, you put it into your account somewhere, it goes into a fund, and suddenly a brick factory in New Delhi can buy a machine, make more bricks, make more money. So, a global routing function. Insurance: all about risks, pricing risks and spreading them around. Essential for every person and also for businesses.

Okay, then there's this guy. Like all businesses, when there's money involved people get greedy, so in finance there are those elements; there have been scandals and crashes. But if we step away from that, finance is essentially infrastructure for our society and for making things work.

The first thing I'd like to highlight is that it's big. It's bigger than most people think, both in terms of size and impact. Let's get concrete. This scary diagram is a map of the 500 biggest stocks in the US. The size of the squares is the market cap of
the company, how big they are. So you can see Google, Microsoft, Amazon, Apple as the really big blobs. This part here, about 20%, that's the finance industry. This part basically deals with moving money around to help all the other businesses do their jobs. It's bigger than consumer goods, and it's the same size as healthcare. So when we say the finance industry is big, just remember this picture: about 20%.

And then, why should you care? Well, like I mentioned, there are interesting jobs. Got some stats from before: 12 and a half percent, but since the revenues are so big, they're actually quite lucrative jobs: 26% of all corporate profits.

Okay, maybe you don't work in the industry, but you are making a really cool new machine learning product. A big part of your customers are going to be in finance, and they're also customers that have big budgets for IT. A recent survey from 2018: finance spends 440 billion US dollars on IT. 30% of that goes in-house, the rest is outsourced, and it's growing.

Finally, the one I mentioned: personal finances. If you're really talented, you come out, you work super hard, put all your money in the bank and retire, that's not the smartest thing. If you do the same thing but work 2% less and spend those 2% managing your money in a good way, you could maybe double what you retire with. So not doing anything about it is shooting yourself in the foot. That's just good to be aware of. Everybody has to be involved in finance, so you might as well know the rules, and it can be really fun too.

Now, okay, that was finance. So back to why we're here: Python. I'll highlight some areas where I've seen it used in interesting ways. First one: trading and risk. You've probably seen these kinds of pictures on the news. There's this big trading floor at a bank, and they're asking about the economy. These guys are trading commodities, FX, currencies, etc. First thing you might wonder: why do they need like 12 screens? I have no idea. I've been in there; they don't switch tabs very well. Second thing: what are they running there? Well,
it all looks like Excel. What are they doing there? What they're running are usually trading systems, and these are systems that are really complex. They're actually too complex for the banks to build themselves, so there are a couple of global vendors that build them. These systems handle all kinds of calculations and huge amounts of data, in super real time, you know, milliseconds and microseconds. They are millions of lines of code, parts of them have PhD-level math, and they're super expensive: easily two thousand US dollars per user per month. And they need to be customizable.

The one I'm super familiar with is Front Arena. It's built here on Södermalm by about 200 developers; originally Swedish software, now a US company. Back in '99 they made the decision: we need to customize this, we'll put in Perl or Python. Luckily they went for Python, which was a good long-term bet, and nowadays a bunch of the business logic is developed in Python. This company has courses where they teach people in finance to use Python, with a thousand-plus people certified.

More technically: Python embedded in the C++ application lets you combine the raw speed of C++ with the ability to go to a new market and customize how trades are priced, or local workflows. The Python code is stored in the database, so you can make changes instantly. You have a solid application delivered by the vendor, you push new Python code into the database, and things update in real time.

Now moving to the secretive parts. I said some parts of finance are not that well known. You have the hedge funds and the quants. The idea here is to beat the market and make money. So basically you're saying: we're going to figure out some mispricing and we will just make money. We will look at data, have lots of ideas, evaluate them, and then make better returns than investing in the stock market. And of course, if you know how to do this, you're not going to tell anybody, because if everybody does it there's no money to be made. And these are like,
they invented big data before there was big data. They're all over machine learning and AI, you know, exactly when it was happening. And this can be crazy stuff like taking aerial photographs, counting cars in parking lots with some kind of algorithm, and trying to figure out if the store is doing good business before the report comes out. So, really fun stuff.

The little picture on the side here, the horrible ugly web page, is a company, I think the second biggest hedge fund. They manage 110 billion dollars in assets, a super sophisticated company, and their website basically says: hey, we manage money. No email, no telephone numbers, just an address to send legal documents. So this is kind of the way it works. And then you can read forums and stuff and figure out which ones are good, which ones are not, what they're hiring for, etc.

Just one anecdote there, because hedge funds are fun when you think about speed. They will typically use Python to develop the algorithm, then they'll recode it in C++ to be fast, because Python is slow. I'm going to state that regardless of what libraries, PyPy or whatever: in this context Python is very slow. So the normal cycle is Python, then you code it in C++. If it's very time sensitive, you code it in hardware, so you build an FPGA. And if you're doing something that involves geography, say we're going to arbitrage between London and Frankfurt, so trade the same things in two places and see if I can find mispricings, they will build these networks. The map you see there of Europe is microwave towers. Light travels faster as microwaves than in optical fibers, so if you can go from London to Frankfurt faster than anybody using a normal internet line, you can beat them and make a profit. So there are three or four companies that have built the shortest path they can find of microwave towers, to cut a couple of microseconds and achieve an advantage there. It's kind of like
the final level, when you're going to be really, really fast.

It's not all dark magic, though. You can actually try this out. People discovered crowdsourcing, and that there are a lot of smart people out here who can apply new techniques to solve these problems. So here are two companies, and they basically work on the same premise. They give you all the tools and all the data, and you can write up an algorithm. So maybe you think that if I buy a stock on Monday and sell it on Friday, it's always going to be profitable. Yeah, exactly, it's a bit harder than that nowadays. Then they backtest that. And then the magic comes: if your algorithm looks good after all their testing, they will run it with real money and give you like 10% of the profits. So, zero risk for you, if you think you can beat the stock market by writing algorithms. And it's all in Python, by the way. When they needed to do this, they picked Python as the language they roll out to people, to quickly and easily learn and then build algorithms. So, good fun things to look up at home. Also just a very good example of how you can use Python to provide a very high-level API, like "do something with stock trading".

Switching track: open banking. I met some people who mentioned this at lunch; it's getting well known. Regulation changes: the governments have realized that your data is very valuable and it shouldn't be locked up at the banks. So we recently got a law called PSD2, saying that the banks have to share their data and make it available through APIs. Then you get new companies coming up around this. One company is Tink. How many have heard about Tink? Okay. Located over there by Centralen. For the few of you who hadn't heard about it, they make an app that looks at your spending and basically tells you you're spending three thousand kronor on coffee, and then you can make a decision if that's good or bad. I guessed they were using Python, so I interviewed them, and sure, they are. They're a typical ML case. They use Python.
What they appreciate about it is that it's easy to get people started, there's great ML support, and they can also roll out the same models in production. So it's good for development, with good deployability. Not going to get into that too much, it's basically a rehash, but it's a good example of where it is happening.

Last use case, I'll pick our own company: startups. When you're a startup, you have to do a lot of things with little resources and you need to be super efficient. So what we do: we're called Njorda. It's similar to Tink, but it's for your finances. Coming from the banking industry, we want to tell people how not to get ripped off. So we built an app that collects all your savings, pensions and investments and then analyzes it with professional risk analysis, so you can see what's happening, what you could do, how you could save on fees. So, Tink for investments. But to do this we needed to collect a lot of data, clean it up, manage complex data models, serve APIs and do really fast risk calculations, with a team of six people, in six months. So then we reached for Python, which has really great support for this, straight through.

Talking to other startups: pretty much every startup I've seen here or advised in the fintech industry has Python elements in it. So, very good penetration in new companies that can choose a fresh stack. At the banks you still have COBOL etcetera for legacy reasons, but if you're doing a new company for finance, Python is very much going to be in your focus.

Okay, those were a couple of use cases, to give you a flavor. Now let's step back and summarize. So I'm saying Python is used everywhere. Why is this? What makes it interesting?

First one, and it's actually really old school: Python is easy to learn. I couldn't find a quote, but I believe it's designed to be easy to read, and that a developer should be able to pick it up in one day if they know another language. There were a bunch of early initiatives around
this, and since the early competition was C++, Java, etc., which are much more difficult, this was an advantage. Back in the 2000s, when I was working on Front Arena and selling this trading system, we were coming into big banks and telling them: you're going to have to learn this language called Python, you've never heard of it. And then we'd do demos and live coding, and they'd say: okay, I understand what's happening. And they'd pick it up. Something like C++, Scala, Haskell, etc., while great languages, have a big learning curve.

I think the old slogan "batteries included" is still true, and Python is also something called a glue language. This is less glamorous. A typical bank: this is kind of like a system map, and it continues for quite a bit. So this is like 50 systems, and there's a lot of glue you need. You need to parse some XML file, call a SOAP service, put it all together, send it up to an FTP server. And you're behind the firewall, so you can't access the internet, and to get a package delivered you have to have IT screen it, which takes two weeks. So having a language where everything is just built in, it just works, you can deploy it: huge advantage. Just simple work, getting stuff done, is a big value in a big, complicated organization.

I think this is a big one: good, mature libraries for the numerical and scientific side. I think we've already seen pandas and NumPy all over for machine learning. pandas actually started its life in finance: it came from AQR Capital Management, kind of like a hedge fund, which needed to do a lot of time series analysis, and that's one reason it's so well suited. It handles, you know, filling missing data, which I'll demo, but it also handles weird stuff like proper business day handling. Super boring, but if you're actually doing something and you're working against London and New York, you only want to look at the dates that are business days in both countries. So, easy stuff to filter out that is not covered by your standard date
library. So, many platforms, specialized ones. We talked about the algo trading: there are really good libraries for doing robotic trading etc. that come from these companies I was talking about earlier and have been open sourced. Finally, really good Excel handling. Just being solid on the basics is a big plus.

Speed. I said Python was slow. It is, but embedding C++ lets you add speed where it's needed. That's why NumPy is really fast: it's using specialized compiled things, like MKL, the Intel library. Looking at other platforms where I've tried to do similar things, like Scala and Java: it's possible, but the compile steps are horrible, instead of just having a pre-built library that's super optimized and super fast. So as long as you vectorize things, Python can stretch further than you think. I talked about embedding already. For the future, I've seen a lot about Cython, Numba, etc. I've not seen them used live so much, but there are some exciting parts there about actually being able to write things in Python and then having it be really fast. So it looks super promising; I just haven't seen it that much.

Okay, not rehashing this one: ML and AI, of course super big in finance. There are a bunch of great use cases. I will just say one thing: explaining why a model does something is extra important in finance. It's very compliance-heavy, very regulated. So if you say no to somebody about a loan and there's any risk that your AI is doing this because of gender or race or something: huge problem. Grab me at a break if you want to talk about it. But of course this is a big area where Python gets a big lift from being strong in the ML space.

And then, so this isn't totally one-sided: why not Python? The speed we mentioned. Talent pool: it is actually a smaller community. So for a big, huge thing, say we're going to rewrite our core banking system and need to hire 300 developers, we can get that in Java; I mean, the consultancies can serve us hundreds of Java developers. But
are we sure we can staff it with that many senior Python back-end developers? So I would say this is a braking factor right now, or a stopping factor. I've been on advisory boards for startups where people plainly said: don't even try it, go for Node instead, or something else, because it's hard to hire people. But here we're talking about big companies. So the more people we get, the better it's going to be in the long term.

Some parts, like core banking, are these really big enterprise applications. I think you basically need static typing to have hundreds of people working on something huge and business critical. I haven't seen anybody build that kind of thing in Python. Not saying it can't be done, but you would need super talented people. So when you need to build something super big and complicated, you tend to reach for Java or something that is well known and supported for big project development. That may change in the future.

Okay, those were the basic slides, and now I want to do an example. You've probably heard things like "don't put all your eggs in one basket", or that if you buy stocks you shouldn't buy just one, you should buy several. Who's heard that kind of advice? Yes, good. So let's have fun, and then you can try that out yourself.

What I've set up is a typical example, using a Jupyter notebook. I want to highlight a couple of things here. Normal parts: importing data. Okay, first one: when you do this, you need data. Luckily there's a really good library related to pandas, called pandas-datareader. This connects automatically to a bunch of standard sources. Boom, load Apple data. What did I get? Asking for data for Apple, like the stock price, I actually got a lot of things here, and this is kind of "welcome to finance". You have an opening price, closing price, high and low for each day, and some other stuff over here: adjusted close. We'll get into that. And then of course I can get one of them and plot it. You always check your data. So, looks great. I have Apple there. Surprise.
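A rough sketch of the kind of table described at this point in the talk, using made-up numbers instead of a live download. The open, close, high, low and adjusted-close columns follow what the speaker lists; the `Volume` column, the dates and every value are assumptions for illustration only. The pandas-datareader call itself is left out, since which data sources it can reach is not stated in the talk.

```python
import numpy as np
import pandas as pd

# Business days only: weekends are skipped (exchange holidays are not handled here).
dates = pd.bdate_range("2017-01-02", periods=5)

# Made-up closing prices standing in for a downloaded price history.
close = np.array([100.0, 101.5, 99.8, 102.2, 103.0])

prices = pd.DataFrame(
    {
        "Open": close - 0.5,
        "High": close + 1.0,
        "Low": close - 1.0,
        "Close": close,
        "Volume": [1_000, 1_200, 900, 1_500, 1_100],  # hypothetical column
        "Adj Close": close,  # equal to Close here: no splits or dividends
    },
    index=dates,
)

# Pick one column to get a single time series, then inspect it.
adj_close = prices["Adj Close"]
print(adj_close)
# In a notebook with matplotlib installed, adj_close.plot() would draw the chart.
```

Selecting a single column returns a date-indexed series, which is the shape the later steps of the demo work with.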
Let's look at another one. So I'll do Visa, for a bit further back. See a huge jump? So what's happened here? Any guesses? Yeah, a stock split. Sometimes when the stock price gets too high, the company decides: we want to make it cheaper for people to buy, so we'll divide the price by seven and give everybody seven times their stocks. This will kill your data analysis. You're doing some prediction and suddenly the price drops by a factor of seven. So of course there are people who clean this, and you're actually getting that data for free from these open sources. It's called the adjusted price. Somebody back-calculates and fills the old data, and you have a continuous series you can use for analysis.

By the way, how many people here own or trade stocks in any way? Awesome.

A utility function to just load this data; I cache it; give me the close price. Not super interesting. Now let's look at a basket. I'll take three tickers and build a data frame from them. Boom. Not super useful as is. I'm looking at three big tech stocks here, just because they're familiar names. And of course I want to look at the relative changes instead. With pandas it's all vectorized, so in order to take this and have all of these start at one and just show me the relative change, it's a one-liner: take the historical prices divided by the first row of the prices. Does exactly what I want. So now I can see how they performed in 2017, which was a great year to invest. Google was actually doing worst of them there.

So let's say my strategy is: I will buy an equal amount of each stock, just to keep it simple. In practice I would do this weighted, so you try various percentages to figure out what the optimal mix is, but let's just say I'll go 33% in each one. Then take the returns, sum them up, create a new column, divide by the number of stocks, and... there is a glitch. What the hell is happening here? There's a huge spike there in my portfolio. Looks strange. And I can see there's a little gap in Apple there, so some kind of
price is missing. This is super typical for finance; welcome to your day job. You're getting data from all these vendors, and, you know, there's a flood, or the stock exchange wasn't open, or something happened with this company, and the price is missing. So always check your data. pandas again has really good stuff here. Okay, I want to fill all missing data; I'll just backfill. I could interpolate or whatever; it's all built in. For an analyst this makes life so much easier. Boom, now we have clean data.

If I hadn't done that, and actually had the old values here when I estimate the risk for them, I'd be off by a factor of four. So one missing data point would tell me my portfolio is three times riskier if I hadn't fixed that.

What I'm doing here is calculating volatility. That's a measure of how risky a stock or portfolio is. To do that I need to calculate the standard deviation of the log returns, which sounds horrible, but because pandas is very close to the math, I can basically write: log of returns, percentage change. Boom. And then I can see my expected result: the volatility for Apple, Facebook and Google is 17, 17 and 15 percent, and my portfolio is 14. So yes, by mixing the stocks I get a lower volatility. Great. And you can see that intuitively on the chart: the red line here, which is my portfolio, has the same kind of general movement, but the jumps are smaller. So it's kind of a safer investment.

Okay, then of course I want to explore. I'm not happy. Why these three stocks? Maybe I had a couple of favorites, but I can't buy all of them, so I want to play around with it. A feature I love in here is interact. I can build a simple function; this function here basically just grabs a couple of symbols, does exactly what I did before and plots the portfolio. But with interact I get this: a mini dashboard, I think similar to what Plotly was describing, less advanced and built in, but really easy to just get started with. So now I can see
the thick line here is my portfolio, and these other lines, which I just made the same color, are the stocks. And here I can remove them. So I can say: what if I were to just buy Google? Well, my portfolio is going to be Google. If I mix in IBM, I get this kind of thing. So, a really simple way to build something interactive where you can explore and understand different stock strategies.

Of course, this is all starting from one date, so that's, you know, a little BS. Of course this year isn't going to repeat exactly. I should study many starting dates. It's called backtesting: you take some kind of strategy, in this case the strategy is to buy an equal amount of each stock, and you test it at different dates. First I can do that visually. I set a starting date here, so I can drag the slider, and as I'm dragging it I'm getting new start dates. I can manually scan this and see if there are interesting cases or things that look weird, just to build an intuitive understanding of how my strategy would behave had I started on different days.

Finally, and this I'll skip through a little bit, I can do something more statistically sound. Typically, once you have your first thing, you'll do a deeper study. So I will look at the stocks: are they really correlated, how are their returns connected? And then here I'll do a proper, I think I said in the talk, Monte Carlo simulation: picking random days and combining them to get a statistically equivalent simulation. This is one way of doing it. Basically I'm picking random days from the past history, several years of history, saying that it's probably going to be similar but not exactly the same. And now we're getting heavier: it's doing this a thousand times with 60 samples, so 60,000 samples. Boom, very fast, because it's all vectorized C++. And then you can even plot this. Now this takes a bit more, because it's a lot of lines. What you're going to see is a chart showing a thousand different simulations of what this kind
of mixed stock portfolio would look like. So getting into the... yeah, here we go. Boom, boom. And this is what you would expect. We're starting at one, the same value, and in general it's going upwards. We have some real extreme outliers: hey, I could lose 15 percent here, I could gain 25 percent here. Extreme cases. It's a really good way of just sampling and seeing what the extremes are, and then of course presenting it as quantiles. So I'm saying: if I do this strategy, and if the stocks have similar volatility, so a lot of ifs here, but this is kind of the way banks work too, just with a bit more data, then I can be 95 percent certain of being in this span, and my probable outcome is a 7% return, which is pretty okay for this period.

So that's it for the sample. The point I want to get across is: if you just want to play around, you can. It's all pandas. Just grab the data from the standard sources. There are a ton of tutorials; when I post the slides I'll put some good links in there. And if you get this orange letter from the government saying your pension is going to be worth this much, then you can just double check that: do your own pension assessment, or stock trading.

So, before the Q&A, to wrap up: finance is really big, 18% of the economy. Python is used pretty much everywhere there, so as a developer you have an extra good opportunity there, and as a business these are probably lucrative customers if you're doing anything with data, compliance, etc. Python's success has been about being fast to learn, with great libraries, and now ML is just giving it an extra boost. And if you think it's interesting, spend some time on it. Even if you just learn something and can ask better questions the next time the bank tells you to invest in a particular fund: go for it.

Oh yeah, I have to add that we're not hiring, because we're a small startup, so currently we're fully staffed. But if you like our mission of helping everybody manage their
finances better, talk to me, and, you know, we're probably expanding next year. That's it, thank you.

Wow, we have our first question already.

Hey, thanks for the presentation. I remember you raised some concerns regarding the static typing situation with Python. Recently there have been some efforts with new things like mypy, adding some further static checking, and last time I checked, and I actually used it, it is evolving quite well. But I wanted to ask whether, in your experience, you have seen it in the wild, or how you see it.

Yes. Well, I have not. And in general I've seen that people who work in these places are super business focused and super busy, so they often don't have the time to do the really deep dives and innovations in setting up the latest new infrastructure. They'll just grab it, use it and solve a real problem. So, probably not leading edge there. But I do see that as the long-term solution. Once they're seeing that Python is all over anyway, then if they can get it stabilized with something like that, maybe you can build these monster applications on it. And there I'm really talking about, you know, the army of 200 consultants all working together and huge builds, that kind of environment.

So, any more questions?

Hello. Finance is really delicate, in that if I make a mistake then a lot of people are going to be very angry. How do you handle human mistakes? How do they handle that in finance?

Yeah, that's actually a very interesting question, because you're right. In the places I've worked, if some price feed goes down and you don't detect it, you're losing a hundred thousand dollars, like bam, because somebody's going to see your mistake and they're going to trade on it. The answer is, I guess, very carefully, and knowing what's important. If you're doing some kind of internal report, sure, you can go really fast and work quickly. If you're doing something that
modifies the prices you're quoting to the market, you know it's super dangerous, and you get a lot of people to check it. It's about having a very good feeling for when to go fast and when to be very slow and very careful. And then an enormous amount of monitoring, and also having manual people in the check, yeah, in the loop. On one project we did a go-live where we had new products going out, and I think something like a third of the deals had to be backed out. But there was a person manually checking things before they went out, and in the go-live period they could do that. Then once you scale up, you remove the manual person. So that's definitely one way.

Last question.

Thank you. I would like to ask you: in combination with Python, what would be the second element that you would suggest someone study if they want to get deeper into the fintech field?

Right, that depends on the area. If you're going for the hedge funds and high-frequency trading: C++, that kind of thing. If you're going for, you know, more operations, being really good at Excel can actually be good. It's used so much that if you have superpowers there, it's really good. But it totally depends on where you're going. For the hardcore stuff, C++, maybe Scala or Java, is a good complement.

Okay, that was it for the questions. Yep, thanks. [Applause]
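The portfolio demo described in the talk (relative prices, an equal-weight basket, backfilling a missing price, volatility from log returns, and a random-day Monte Carlo run of a thousand paths with 60 samples each) could be sketched roughly as below. This is not the speaker's notebook: the prices are synthetic random walks, the tickers `AAA`, `BBB` and `CCC` are hypothetical, and the annualisation by the square root of 252 trading days is an assumption the talk does not spell out.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
dates = pd.bdate_range("2017-01-02", periods=250)

# Synthetic daily prices for three hypothetical tickers (not real market data).
daily = rng.normal(loc=0.0005, scale=0.01, size=(250, 3))
prices = pd.DataFrame(
    100.0 * np.exp(np.cumsum(daily, axis=0)),
    index=dates,
    columns=["AAA", "BBB", "CCC"],
)

# Simulate a vendor gap: one price is missing.
prices.iloc[100, 0] = np.nan

# Fill the gap before any analysis (backfill copies the next known price).
clean = prices.bfill()

# Relative performance: divide every row by the first row so all series start at 1.
relative = clean / clean.iloc[0]

# Equal-weight portfolio: average of the three relative price series.
relative["Portfolio"] = relative.mean(axis=1)

# Volatility: standard deviation of daily log returns,
# scaled to a year with sqrt(252) (the scaling is an assumption).
log_returns = np.log(relative / relative.shift(1)).dropna()
volatility = log_returns.std() * np.sqrt(252)
print(volatility.round(3))

# Monte Carlo by resampling: draw 60 random past days, 1000 times over.
n_paths, horizon = 1000, 60
portfolio_returns = log_returns["Portfolio"].to_numpy()
picks = rng.integers(0, len(portfolio_returns), size=(n_paths, horizon))
paths = np.exp(np.cumsum(portfolio_returns[picks], axis=1))  # growth relative to 1

# Summarise the final values of all paths as quantiles.
low, median, high = np.quantile(paths[:, -1], [0.025, 0.5, 0.975])
print(f"95% of simulated outcomes lie between {low:.3f} and {high:.3f}; median {median:.3f}")
```

Because the three synthetic series are generated independently, the averaged portfolio comes out less volatile than any single ticker, which mirrors the diversification effect the speaker points to. Real stocks are correlated, so the reduction is usually smaller.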
Info
Channel: PyCon Sweden
Views: 128,407
Rating: 4.957983 out of 5
Id: kBwOy-6CtAQ
Length: 31min 51sec (1911 seconds)
Published: Wed Nov 27 2019