Day in the Life of a Data Scientist @ Amazon (London)

Video Statistics and Information

Captions
[Music] Welcome everyone, this is going to be a day in the life of a data scientist. I have two things on the agenda: first, going through what a Friday looks like, and second, going through a couple of learnings that I would have wanted to know about the complexities and challenges of working with this day to day. So let's get into it.

We kick off the day with getting ready and packing a pair of extra clothes for the date night that is coming up tonight. On top of that, we are getting our AirPods Pro 2 that we picked up recently. Absolutely amazing, because the noise cancellation makes it really easy to get focused time in the office. I have another preference: even though I'm very close to the office, I want to take my bike and take a little cycle every morning to get a little bit of cardio in. London is famous for not having the best weather, but we have our days where it's getting prettier, and days like this.

Being the Amazon office, it's quite vast. We have a bike storage on the basement floor, which is the commuter area. As you can see, it's so nice and colorful on the wall. You can put your bike here and just go back up to the office. They've recently set up some pillars here showcasing underexposed artists, actors and singers and giving them some publicity, which was really nice.

This is the floor that I'm working on, which has all the essentials of looking like an office. This is my desk that I prefer to work at. It's an open floor plan, but it's still one that I enjoy setting up at, and it has this lovely view to the right, so it feels very spacious and nice when you're crunching numbers. However, today I felt like I wanted to go to the 15th floor, which is quite famous for having a very nice view and also a very open floor plan, where you can sit and either talk with your team or just sit by yourself on the sofas and couches close to the window, which makes it look and feel quite nice. And very much coming from someone who's been living in Sweden
his whole life, it's a very nice experience to see this kind of big area with a very aesthetic interior.

Then we went on to get a quick coffee. I do tend to pour out a little bit of water to get a little bit more of a strong black flavor. I've also been trying to drink a little bit more water day to day, and I try to drink at least two cups while the coffee is filling up.

I also try to spend 30 minutes on Friday mornings learning about some new topic that I haven't explored yet. This day I wanted to learn more about reinforcement learning. A key concept here, from what I understand so far, is something called a Markov decision process, or MDP for short. It basically talks about an environment, a state, an action, a reward and an observation. I'll put the link to this in the description if you want to read more about it. But to take a step back: this is a new concept for me that I don't really know anything about, and the way I learn best is to come up with examples and try to explore the topic in my own way. So what I've done is make a little table here in Notion where I try to come up with different reinforcement learning examples, make up environments, states, actions, rewards and observations for them, and then cross-reference them against the actual answers. I am a big proponent of active recall, where you test yourself as much as possible to really see if you're understanding the topic. In this case I'm trying to understand how I can apply this and what the different examples are before I dive into, for example, coding this up in an applied science setting.

Next up, I was working on a very common issue that I run into very frequently in my time series forecasting: you have data that is not fully coming through the SQL code or the Python code that you're writing. It's basically a data pre-processing issue where, for example, there are maybe duplicates somewhere or maybe missing values somewhere
when you do a join. This is something that I really didn't expect to be such a common thing to work on here at Amazon, but there is so much data to handle, so many joins and so many merges, that you need to be on top of it every time. A lot of people associate data science with building complex machine learning algorithms, but the most important core issue that you need to deal with first is having high-quality data to feed into the model. A well-engineered data pipeline that you can then iterate your model on top of is way more valuable than finding a really good model to begin with and then realizing that the whole model isn't actually supposed to perform this way, because your data is faulty or certain assumptions are not actually reflected in reality. Data pre-processing is a very important topic to get your head around and has been a really big part of my development as a data scientist.

Next up was to actually train my model on some new assumptions. This is an XGBoost model that has been performing quite well. You see this often on Kaggle, but it's actually very useful in reality as well, because it doesn't tend to overfit on your training data, which is a really good feature to have. Another part is hyperparameter tuning, which is something that I didn't have much experience with before I came to Amazon. The idea is that you're tuning your model so it doesn't overfit or underfit, and then on top of that you're also adding new features and doing proper feature engineering. Related to these two things is one principle that I've found really interesting to work with, which is to fail fast. It basically means that you want to build a good data pipeline, you want to have it set up in a well-engineered way, and then you can try all of the features and all of the different things, so that when you actually
get your results like this, you can tune them and iterate on your problem, so you can get those sweet incremental improvements as you go.

I then went down to have some lunch in our Amazon restaurant. It is basically a bit of subsidized food; I think we pay five quid for a decent lunch. I then went back to the 15th floor, to this slightly more aesthetic area, where I could work on some more administrative tasks with a view. I do tend to enjoy having a bit of a spacious environment and switching it up from day to day to get a little bit more clear-headed when I'm working on numerical problems.

Now I would also like to highlight the actual biggest challenge, one that I did not know would be such a huge development area for me, and that is the writing of white papers. A white paper is basically a one- to three-page document where you present: this was the problem, these were the actions we took, and these were the results. You formulate your technical forecasting models or classification algorithms at a level that an L7 or L8 business person can understand, so they can see the impact of what you've done. It also includes a risk assessment of the potential risks in this type of forecasting or classification, and what the next steps are. Formulating technical forecasting models in a way that is very easy to understand has been way more tricky than I ever anticipated. There are a lot of people here who have helped me develop this skill, which has been really nice, but at the same time it's been very, very difficult for me to actually adopt it. It's a work in progress. I will continue to work on it, and some day I will hopefully be able to write really well-written white papers as well.

I then packed up my things and went down to the 10th floor. I just wanted to show off one of the areas that I tend to go out to when I want to take a little bit of a break and get a bit of fresh air, because since there are 15 floors you would have to get down to the bottom floor if you
want to get some fresh air. But luckily they actually built this almost Zen garden-looking thing where you can go out to either have your lunch or just take a quick stroll. It has a couple of trees and a little bit of greenery. It isn't a forest, but in my experience, at least coming from Sweden and having had a lot of forest around me previously, it does actually help to see a little bit of greenery day to day when you're working in a corporate office on a lot of algorithmic and mathematical problems. So I do genuinely enjoy having an atmosphere that is a little bit more vibrant and green rather than always just having the corporate space.

Then, also on floor one, we have this cafe area where you can get one free cup per day. A bit of frugality, one of our leadership principles, but I don't need more than one nice cup of coffee per day anyway. So I got an oat milk coffee in this case. I'm of course biased, since Oatly is the main choice of oat milk here, which is of course the Swedish oat milk brand. I take a bit of pride in that one, so cheers to that.

I finished up some administrative tasks before cycling home. I had booked a restaurant for tonight for me and my partner to go out and celebrate her belated birthday, because I was a little bit delayed in actually giving her a birthday gift. It's called Samsa, a Thai restaurant that is really, really nice. They had a Mai Tai-style drink that was really fresh with lime, and some orange wine. After the delicious dinner we went on to grab a whiskey sour at a random hotel which looked quite aesthetic.

But that about wraps it up. Finishing up with this sip of the drink was the end of the day. It has been a productive one, and I have thoroughly enjoyed it myself. I hope you guys enjoyed the video. If so, please leave a comment and like the video, and otherwise I'll see you in the next one. [Music]
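The join problem described in the captions (duplicates or missing values showing up after a join) can be illustrated with a minimal sketch. This is not the speaker's code: the function name `left_join_report`, the field names and the toy data are all hypothetical, and it uses plain Python dictionaries rather than SQL or any particular data library. It only shows the kind of check that catches the two symptoms mentioned, namely rows multiplying because of duplicate keys and rows losing values because a key has no match.

```python
from collections import Counter


def left_join_report(left_rows, right_rows, key):
    """Left-join two lists of dicts on `key` and report likely data-quality issues."""
    right_key_counts = Counter(row[key] for row in right_rows)
    # Keys that occur more than once on the right side multiply left rows in the join.
    duplicate_keys = sorted(k for k, n in right_key_counts.items() if n > 1)
    # Left keys with no match on the right end up with missing values after the join.
    unmatched_keys = sorted({row[key] for row in left_rows} - set(right_key_counts))

    # Index the right side by key so each left row can find all of its matches.
    right_by_key = {}
    for row in right_rows:
        right_by_key.setdefault(row[key], []).append(row)

    joined = []
    for row in left_rows:
        # An unmatched left row is kept once, without any right-side columns.
        for match in right_by_key.get(row[key], [None]):
            merged = dict(row)
            if match is not None:
                merged.update(match)
            joined.append(merged)
    return joined, duplicate_keys, unmatched_keys


# Hypothetical toy data: daily sales joined to a product lookup table.
sales = [
    {"product_id": "A", "units": 3},
    {"product_id": "B", "units": 5},
    {"product_id": "C", "units": 2},
]
products = [
    {"product_id": "A", "category": "books"},
    {"product_id": "B", "category": "toys"},
    {"product_id": "B", "category": "games"},  # duplicate key in the lookup table
]

joined, duplicate_keys, unmatched_keys = left_join_report(sales, products, "product_id")
print(len(sales), "rows in,", len(joined), "rows out")
print("duplicate keys:", duplicate_keys)
print("unmatched keys:", unmatched_keys)
```

Comparing the row count before and after the join, and listing the offending keys, is one simple way to notice such problems before the data reaches a forecasting model.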
Info
Channel: Olle Green
Views: 34,836
Id: h7a0CnUaMkc
Length: 11min 6sec (666 seconds)
Published: Sat Oct 29 2022